diff options
Diffstat (limited to 'js/src/doc/JITOptimizations/Strategies.md')
-rw-r--r-- | js/src/doc/JITOptimizations/Strategies.md | 582 |
1 files changed, 582 insertions, 0 deletions
diff --git a/js/src/doc/JITOptimizations/Strategies.md b/js/src/doc/JITOptimizations/Strategies.md new file mode 100644 index 000000000..d4feef310 --- /dev/null +++ b/js/src/doc/JITOptimizations/Strategies.md @@ -0,0 +1,582 @@ +# JIT Optimization Strategies + +SpiderMonkey's optimizing JIT, IonMonkey, uses a number of different +optimization strategies to speed up various operations. The most commonplace +operations that are relevant for fast program execution are property accesses +and function calls. + +Optimization information is currently collected for the following operations: + +- [BinaryArith](#binaryarith) (`+-/*%`) +- [GetProperty](#getprop) (`obj.prop`) +- [SetProperty](#setprop) (`obj.prop = val`) +- [GetElement](#getelem) (`obj[elemName]`) +- [SetElement](#setelem) (`obj[elemName] = val`) +- [Call](#call) (`func(...)`) + +At each operation site, IonMonkey tries a battery of <i>strategies</i>, from +the most optimized but most restrictive to the least optimized but least +restrictive. For each strategy attempted, its <i>outcome</i> is tracked. An +outcome is either success or a reason why the strategy failed to apply. + +This page documents the various optimization strategies and their outcomes. It +provides information on what they attempt to do, what general level of +speed-up they provide, what kind of program characteristics can prevent them +from being used, and common ways to enable the engine to utilize that +optimization. + +## <a name="binaryarith"></a>BinaryArith + +### BinaryArith_Concat + +Attempts to optimize an addition to string concatenation. The types of the operands are checked to contain a hint of being a concatenation. Both have to be string or one has to be a string and the other a type that easily can get converted to string (like numbers). + +### BinaryArith_SpecializedTypes + +Attempts to do a numeric arithmetic based on operand types. One of the inputs need to be a numeric type and the other a simple arithmetic type. + +### BinaryArith_SpecializedOnBaselineTypes + +Just like BinrayArith_SpecializedTypes tries to do a numeric arithmetic, but based on the observed types in the Baseline Compiler. If it succeeds it will roughly give same optimization as BinaryArith_SpecializedTypes, but the deduction is more fragile. In the best case one should try to get this optimization based on the input types. + +### BinaryArith_SharedCache + +Attempts to optimize a binary arithmetic using inline cache. + +### BinaryArith_Call + +Last resort which cannot get specialized at all and takes the slow path to do the arithmetic. + + +## <a name="getprop"></a>GetProperty + +### GetProp_ArgumentsLength + +Attempts to optimize an `arguments.length` property access. This optimization +only works if the arguments object is used in well-understood ways within the +function. The function containing the arguments.length is allowed to use the +arguments object in the following ways without disabling this optimization: + +- Access `arguments.length` +- Access `arguments.callee` +- Access individual args using `arguments[i]` +- Save arguments into variables, as long as those variables cannot be accessed + by any nested function, and as long as there exists no eval nywhere within + the function or nested function definitions. +- Call a function using `f.apply(obj, arguments)` + +If the function contains any use of the arguments object that falls out of the +cases defined above, this optimization will be suppressed. In particular, +`arguments` cannot be returned from the function, or passed as an argument into +calls (except for the `apply` case above). + +### GetProp_ArgumentsCallee + +Attempts to optimize an `arguments.callee` property access. This optimization +only works if the arguments object is used in well-understood ways within the +function. The function containing the `arguments.callee` is allowed to use the +arguments object in the following ways without disabling this optimization: + +- Access arguments.length +- Access arguments.callee +- Access individual args using arguments[i] +- Save arguments into variables, as long as those variables cannot be accessed + by any nested function, and as long as there exists no eval nywhere within + the function or nested function definitions. +- Call a function using `f.apply(obj, arguments)` + +If the function contains any use of the `arguments` object that falls out of +the cases defined above, this optimization will be suppressed. In particular, +arguments cannot be returned from the function, or passed as an argument into +calls (except for the `apply` example listed above). + +### GetProp_InferredConstant + +Attempts to optimize an access to a property that seems to be a constant. It +applies to property accesses on objects which are global-like in that there is +only one instance of them per program. This includes global objects, object +literals defined at the top-level of a script, and top-level function objects. + +This optimization makes the assumption that a property that has not changed +after it was first assigned, is likely a constant property. It then directly +inlines the value of the property into hot code that accesses it. For +example, in the following code: + + var Constants = {}; + Constants.N = 100; + + function testArray(array) { + for (var i = 0; i < array.length; i++) { + if (array[i] > Constants.N) + return true; + } + return false; + } + +Will have the loop compiled into the following when `testArray` gets hot. + + for (var i = 0; i < array.length; i++) { + if (array[i] > 100) + return true; + } + +When this optimization is successful, property access is eliminated entirely +and replaced with an inline constant. + +### GetProp_Constant + +Attempts to optimize reading a property that contains a uniquely-typed (or +"singleton") object. With uniquely-typed objects, it is guaranteed that +no other object has that same type. Unique (or "singleton") types are +assigned to certain kinds of objects, like global objects, top-level +functions, and object literals declared at the top-level of a script. If +a property has always contained the same uniquely-typed object, then the +engine can use the unique type to map back to a specific object, and +eliminate the property access, replacing it with a constant reference to +the object. + +When this optimization is successful, property access is eliminated entirely +and replaced with an inline constant. The different success and failure +conditions are documented below: + +### GetProp_StaticName + +Attempts to optimize a property access on `window` which refers to +a property on the global object. + +### GetProp_TypedObject + +Optimizes accesses to properties on TypedObjects. + +### GetProp_DefiniteSlot + +Optimizes access to a well-known regular property on an object. For this +optimization to succeed, the property needs to be well-defined on the object. +For objects constructed by constructor functions, this means that the property +needs to be defined in the constructor, before any complex logic occurs within +the constructor. + +This is the best case for a regular "field" type property that is not +turned into a constant. It compiles down to a single CPU-level load +instruction. + + function SomeConstructor() { + this.x = 10; // x is a definite slot property + this.y = 10; // y is a definite slot property + someComplicatedFunctionCall(); + this.z = 20; // z is not a definite slot property. + } + +In the above example, the properties `x` and `y` can be determined to always +exist on any instance of `SomeConstructor` at definite locations, allowing +the engine to deterministically infer the position of `x` without a shape +guard. + +This optimization can fail for a number of reasons. If the types observed +at the property access are polymorphic (more than one type), this optimization +cannot succeed. Furthermore, even if the object type is monomorphic, the +optimization will fail if the property being accessed is not a definite +slot as described above. + +### GetProp_Unboxed + +Similar to `GetProp_DefiniteSlot`. Unboxed property reads are possible on +properties which satisfy all the characteristics of a definite slot, and +additionally have been observed to only store values of one kind of value. + +Consider the following constructor: + + function Point(x, y) { + this.x = x; + this.y = y; + } + +If only integers are ever stored in the `x` and `y` properties, +then the instances of `Point` will be represented in an "unboxed" mode - +with the property values stored as raw 4-byte values within the object. + +Objects which have the unboxed optimization are more compact. + +### GetProp_CommonGetter + +Optimizes access to properties which are implemented by a getter function, +where the getter is shared between multiple types. + +This optimization applies most often when the property access site is +polymorphic, but all the object types are derived variants of a single +base class, where the property access refers to a getter on the base +class. + +Consider the following example: + function Base() {} + Base.prototype = { + get x() { return 3; } + }; + + function Derived1() {} + Derived1.prototype = Object.create(Base.prototype); + + function Derived2() {} + Derived1.prototype = Object.create(Base.prototype); + +If a property access for `d.x` sees only instances of both `Derived1` and +`Derived2` for `d`, it can optimize the access to `x` to a call to the +getter function defined on `Base`. + +This optimization applies to shared getters on both pure JS objects as well +as DOM objects. + +### GetProp_InlineAccess + +Optimizes a polymorphic property access where there are only a few different +types of objects seen, and the property on all of the different types is +determinable through a shape-check. + +If a property access is monomorphic and the property's location is determinable +from the object's shape, but the property is not definite (see: +GetProp_DefiniteProperty), then this optimization may be used. + +Alternatively, if the property access is polymorphic, but only has a few +different shapes observed at the access site, this optimization may be used. + +This optimization compiles down to one-or more shape-guarded direct loads +from the object. The following pseudocode describes the kind of machine +code generated by this optimization: + + if obj.shape == Shape1 then + obj.slots[0] + elif obj.shape == Shape2 then + obj.slots[5] + elif obj.shape == Shape3 then + obj.slots[2] + ... + end + +### GetProp_Innerize + +Attempts to optimize a situation where a property access of the form +`window.PROP` can be directly translated into a property access on +the inner global object. + +This optimization will always fail on property accesses which are +not on the window object. + +It is useful because accessing global names via the 'window' object +is a common idiom in web programming. + +### GetProp_InlineCache + +This is the worst-case scenario for a property access optimization. This +strategy is used when all the others fail. The engine simply inserts +an inline cache at the property access site. + +Inline caches start off as a jump to a separate piece of code called +a "fallback". The fallback calls into the interpreter VM (which is +very slow) to perform the operation, and then decides if the operation +can be optimized in that particular case. If so, it generates a new +"stub" (or freestanding piece of jitcode) and changes the inline cache +to jump to the stub. The stub attempts to optimize further occurrences +of that same kind of operation. + +Inline caches are an order of magnitude slower than the other optimization +strategies, and are an indication that the type inference engine has +failed to collect enough information to guide the optimization process. + +## <a name="setprop"></a>SetProperty + +### SetProp_CommonSetter + +Optimizes access to properties which are implemented by a setter function, +where the setter is shared between multiple types. + +This optimization applies most often when the property access site is +polymorphic, but all the object types are derived variants of a single +base class, where the property access refers to a setter on the base +class. + +Consider the following example: + function Base() {} + Base.prototype = { + set x(val) { ... } + }; + + function Derived1() {} + Derived1.prototype = Object.create(Base.prototype); + + function Derived2() {} + Derived1.prototype = Object.create(Base.prototype); + +If a property write for `d.x = val` sees only instances of both `Derived1` and +`Derived2` for `d`, it can optimize the write to `x` to a call to the +setter function defined on `Base`. + +This optimization applies to shared setters on both pure JS objects as well +as DOM objects. + +### SetProp_TypedObject + +Optimizes accesses to properties on TypedObjects. + +### SetProp_DefiniteSlot + +Optimizes a write to a well-known regular property on an object. For this +optimization to succeed, the property needs to be well-defined on the object. +For objects constructed by constructor functions, this means that the property +needs to be defined in the constructor, before any complex logic occurs within +the constructor. + +This is the best case for a regular "field" type property that is not +turned into a constant. It compiles down to a single CPU-level load +instruction. + + function SomeConstructor() { + this.x = 10; // x is a definite slot property + this.y = 10; // y is a definite slot property + someComplicatedFunctionCall(); + this.z = 20; // z is not a definite slot property. + } + +In the above example, the properties `x` and `y` can be determined to always +exist on any instance of `SomeConstructor` at definite locations, allowing +the engine to deterministically infer the position of `x` without a shape +guard. + +This optimization can fail for a number of reasons. If the types observed +at the property access are polymorphic (more than one type), this optimization +cannot succeed. Furthermore, even if the object type is monomorphic, the +optimization will fail if the property being written is not a definite +slot as described above. + +### SetProp_Unboxed + +Similar to `SetProp_DefiniteSlot`. Unboxed property writes are possible on +properties which satisfy all the characteristics of a definite slot, and +additionally have been observed to only store values of one kind of value. + +Consider the following constructor: + + function Point(x, y) { + this.x = x; + this.y = y; + } + +If only integers are ever stored in the `x` and `y` properties, +then the instances of `Point` will be represented in an "unboxed" mode - +with the property values stored as raw 4-byte values within the object. + +Objects which have the unboxed optimization are more compact. + +### SetProp_InlineAccess + +Optimizes a polymorphic property write where there are only a few different +types of objects seen, and the property on all of the different types is +determinable through a shape-check. + +If a property write is monomorphic and the property's location is determinable +from the object's shape, but the property is not definite (see: +GetProp_DefiniteProperty), then this optimization may be used. + +Alternatively, if the property write is polymorphic, but only has a few +different shapes observed at the access site, this optimization may be used. + +This optimization compiles down to one-or more shape-guarded direct stores +to the object. The following pseudocode describes the kind of machine +code generated by this optimization: + + if obj.shape == Shape1 then + obj.slots[0] = val + elif obj.shape == Shape2 then + obj.slots[5] = val + elif obj.shape == Shape3 then + obj.slots[2] = val + ... + end + +### SetProp_InlineCache + +This is the worst-case scenario for a property access optimization. This +strategy is used when all the others fail. The engine simply inserts +an inline cache at the property write site. + +Inline caches start off as a jump to a separate piece of code called +a "fallback". The fallback calls into the interpreter VM (which is +very slow) to perform the operation, and then decides if the operation +can be optimized in that particular case. If so, it generates a new +"stub" (or freestanding piece of jitcode) and changes the inline cache +to jump to the stub. The stub attempts to optimize further occurrences +of that same kind of operation. + +Inline caches are an order of magnitude slower than the other optimization +strategies, and are an indication that the type inference engine has +failed to collect enough information to guide the optimization process. + +## <a name="getelem"></a>GetElement + +### GetElem_TypedObject + +Attempts to optimized element accesses on array Typed Objects. + +### GetElem_Dense + +Attempts to optimize element accesses on densely packed array objects. Dense +arrays are arrays which do not have any 'holes'. This means that the array +has valid values for all indexes from `0` to `length-1`. + +### GetElem_TypedStatic + +Attempts to optimize element accesses on a typed array that can be determined +to always refer to the same array object. If this optimization succeeds, the +'array' object is treated as a constant, and is not looked up or retrieved from +a variable. + +### GetElem_TypedArray + +Attempts to optimize element accesses on a typed array. + +### GetElem_String + +Attempts to optimize element accesses on a string. + +### GetElem_Arguments + +Attempts to optimize element accesses on the `arguments` special object +available in functions. This optimization only works if the arguments +object is used in well-understood ways within the function. The function +containing the arguments.length is allowed to use the arguments object in +the following ways without disabling this optimization: + +- Access `arguments.length` +- Access `arguments.callee` +- Access individual args using `arguments[i]` +- Save `arguments` into variables, as long as those variables cannot be + accessed by any nested function, and as long as there exists no `eval` + anywhere within the function or nested function definitions. +- Call a function using `f.apply(obj, arguments)` + +If the function contains any use of the arguments object that falls out of the +cases defined above, this optimization will be suppressed. In particular, +`arguments` cannot be returned from the function, or passed as an argument into +calls (except for the `apply` case above). + +### GetElem_ArgumentsInlined + +Similar to GetEelem_Arguments, but optimizes cases where the access on +`arguments` is happening within an inlined function. In these cases, an +access of the form `arguments[i]` can be directly translated into a +direct reference to the corresponding argument value in the inlined call. + +Consider the following: + function foo(arg) { + return bar(arg, 3); + } + function bar() { + return arguments[0] + arguments[1]; + } + +In the above case, if `foo` is compiled with Ion, and the call to `bar` +is inlined, then this optimization can transform the entire procedure to +the following pseudo-code: + + compiled foo(arg): + // inlined call to bar(arg, 3) { + return arg + 3; + // } + +### GetElem_InlineCache + +This is the worst-case scenario for a element access optimization. This +strategy is used when all the others fail. The engine simply inserts +an inline cache at the property write site. + +Inline caches start off as a jump to a separate piece of code called +a "fallback". The fallback calls into the interpreter VM (which is +very slow) to perform the operation, and then decides if the operation +can be optimized in that particular case. If so, it generates a new +"stub" (or freestanding piece of jitcode) and changes the inline cache +to jump to the stub. The stub attempts to optimize further occurrences +of that same kind of operation. + +Inline caches are an order of magnitude slower than the other optimization +strategies, and are an indication that the type inference engine has +failed to collect enough information to guide the optimization process. + +## <a name="setelem"></a>SetElement + +### SetElem_TypedObject + +Attempts to optimized element writes on array Typed Objects. + +### SetElem_TypedStatic + +Attempts to optimize element writes on a typed array that can be determined +to always refer to the same array object. If this optimization succeeds, the +'array' object is treated as a constant, and is not looked up or retrieved from +a variable. + +### SetElem_TypedArray + +Attempts to optimize element writes on a typed array. + +### SetElem_Dense + +Attempts to optimize element writes on densely packed array objects. Dense +arrays are arrays which do not have any 'holes'. This means that the array +has valid values for all indexes from `0` to `length-1`. + +### SetElem_Arguments + +Attempts to optimize element writes to the `arguments` special object +available in functions. This optimization only works if the arguments +object is used in well-understood ways within the function. The function +containing the arguments.length is allowed to use the arguments object in +the following ways without disabling this optimization: + +- Access `arguments.length` +- Access `arguments.callee` +- Access individual args using `arguments[i]` +- Save `arguments` into variables, as long as those variables cannot be + accessed by any nested function, and as long as there exists no `eval` + anywhere within the function or nested function definitions. +- Call a function using `f.apply(obj, arguments)` + +If the function contains any use of the arguments object that falls out of the +cases defined above, this optimization will be suppressed. In particular, +`arguments` cannot be returned from the function, or passed as an argument into +calls (except for the `apply` case above). + +### SetElem_InlineCache + +This is the worst-case scenario for a element write optimization. This +strategy is used when all the others fail. The engine simply inserts +an inline cache at the property write site. + +Inline caches start off as a jump to a separate piece of code called +a "fallback". The fallback calls into the interpreter VM (which is +very slow) to perform the operation, and then decides if the operation +can be optimized in that particular case. If so, it generates a new +"stub" (or freestanding piece of jitcode) and changes the inline cache +to jump to the stub. The stub attempts to optimize further occurrences +of that same kind of operation. + +Inline caches are an order of magnitude slower than the other optimization +strategies, and are an indication that the type inference engine has +failed to collect enough information to guide the optimization process. + +## <a name="call"></a>Call + +### Call_Inline + +A function call `f(x)` usually pushes a frame onto the call stack. Inlining a +call site conceptually copies the body of the callee function and pastes it +in place of the call site and avoids pushing a new execution frame. Usually, +hot functions do well to be inlined. This is one of the most important +optimizations the JIT performs. + +Ion inlines both interpreted (i.e., written in JavaScript) functions and +native (i.e., built-ins such as `Math.sin` implemented in C++). + +A successfully inlined call site has the outcome Inlined. + +Failure to inline comes in two flavors: unable (e.g., unable to determine +exact callee) and unwilling (e.g., heuristics concluded that the time-space +tradeoff will not pay off). |