A closer look at JavaScript addressing, closures, object models and related issues

Precisely because JS is a dynamic language, its addressing is resolved on the spot at run time rather than fixed after compilation as in C. In addition, JS introduced the this pointer, which is a very troublesome thing because it is "implicitly" passed into the function as a parameter. Let's first look at the example from the "Scope Chain" topic:
var testvar = 'window property';
var o1 = {testvar:'1', fun:function(){alert('o1: ' + this.testvar);}};
var o2 = {testvar:'2', fun:function(){alert('o2: ' + this.testvar);}};
o1.fun(); // 'o1: 1'
o2.fun(); // 'o2: 2'
o1.fun.call(o2); // 'o1: 2'
The three alert results are all different. Interesting, isn't it? In fact, all of these interesting and weird concepts boil down to one issue: addressing.
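To try the same behaviour outside a browser, here is a minimal sketch. It assumes a hypothetical log array in place of alert; everything else follows the example above.

```javascript
// Hypothetical stand-in for alert(): collect the messages in an array.
const log = [];

const o1 = { testvar: '1', fun: function () { log.push('o1: ' + this.testvar); } };
const o2 = { testvar: '2', fun: function () { log.push('o2: ' + this.testvar); } };

o1.fun();         // this is o1
o2.fun();         // this is o2
o1.fun.call(o2);  // the function body comes from o1, but this is o2

console.log(log); // [ 'o1: 1', 'o2: 2', 'o1: 2' ]
```

The function body is chosen by where the function is stored, while this is chosen by how the function is called.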
Addressing of simple variables
Is JS static or dynamic scope?
Here is some unfortunate news: JS is statically scoped, which means that variable addressing is much more complicated than in dynamically scoped languages such as Perl. The following code is an example from "Principles of Programming Languages":
01| function big(){
02| var x = 1;
03| eval('f1 = function(){echo(x) }');
04| function f2(){var x = 2;f1()};
05| f2();
06| };
07| big();
The output is 1, exactly as it would be in Pascal or Ada, even though f1 is defined dynamically with eval. Another example, also from "Principles of Programming Languages":
function big2(){
var x = 1;
function f2(){echo(x)}; //Use the value of x to generate an output
function f3(){var x = 3;f4(f2)};
function f4(f){var x = 4;f()};
f3();
}
big2();//Output 1: Deep binding; Output 4: Shallow binding; Output 3: Special binding
The output is still 1, which shows that JS is not only statically scoped but also uses deep binding. Now things get complicated...
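Both examples can be checked directly. The sketch below assumes a hypothetical echo that records values in an array, and declares f1 locally so that the eval assignment does not depend on creating an implicit global; otherwise the code is the same as above.

```javascript
// Hypothetical echo(): record the values instead of printing them.
const out = [];
function echo(value) { out.push(value); }

function big() {
  var x = 1;
  var f1; // declared locally (an assumption) so eval assigns to this variable
  eval('f1 = function () { echo(x); }');
  function f2() { var x = 2; f1(); }
  f2();
}

function big2() {
  var x = 1;
  function f2() { echo(x); }
  function f3() { var x = 3; f4(f2); }
  function f4(f) { var x = 4; f(); }
  f3();
}

big();   // static scope: f1 sees the x of big
big2();  // deep binding: f2 sees the x of big2, wherever it is called from

console.log(out); // [ 1, 1 ]
```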
The concept of ARI
To explain the complex run-time addressing problems of functions (especially in languages that allow nested functions, such as Ada), the book "Principles of Programming Languages" defines the "ARI" (activation record instance): a record on the stack that includes:
Function address
Local variables
Return address
Dynamic link
Static link
Here, the dynamic link always points to the caller of the function (for example, if a is called while b is executing, then the dynamic link in the ARI of a points to b), while the static link describes the parent element in which a was defined. Because functions are organized as a rooted tree, every static chain, followed to its end, leads to the host (such as window). Consider this example (outputs shown in the comments):
var x = 'x in host';
function a(){echo(x)};
function b(){var x = 'x inside b';echo(x)};
function c(){var x = 'x inside c';a()};
function d() {
var x = 'x inside d,a closure-made function';
return function(){echo(x)}};
a();// x in host
b(); // x inside b
c(); // x in host
d()(); // x inside d,a closure-made function
When a is called in the first statement, the "stack" contains the following (top of the stack on the left):
[ARI of a] → [Host]
The static link of a goes straight to the host. Because x is not defined in a, the interpreter looks for x along the static chain and finds it in the host. When b is called, x is recorded among the local variables of b, so the final echo prints the x inside b: 'x inside b'.
Now, the case of c is much more interesting. When c is called, the stack information can be written like this:
Dynamic chain: [a]→[c]→[Host]
Static chain: [c]→[Host]; [a]→[Host]
The addressing of x happens inside a, after c has called it, and the static link of a still points directly to the host. Naturally, x is still 'x in host'!
The case of d is even more interesting. d creates a function as its return value, and that function is called immediately. Because the returned function is created during the lifetime of d, its static link points to d, so calling it outputs the x in d: 'x inside d,a closure-made function'.
Timing of creating static links
Yueying and amingoo said that a "closure" is the "call-time reference" of a function. "Principles of Programming Languages" simply calls it the ARI. The difference is that the ARI in "Principles of Programming Languages" is stored on the stack and is destroyed as soon as the lifetime of the function ends, whereas a JS closure is destroyed if and only if nothing references it or its members any more (or rather, no code can reach it). We can simply think of a function's ARI as an object that merely wears the "clothes" of a function.
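A small sketch of that lifetime rule, using a hypothetical makeCounter function that is not part of the original article: the record created by each call stays alive for as long as the returned function can still be reached.

```javascript
function makeCounter() {
  var count = 0;        // lives in the record ("ARI") of this particular call
  return function () {  // the returned function keeps that record reachable
    count += 1;
    return count;
  };
}

const next = makeCounter();   // first call, first record
const other = makeCounter();  // second call, a separate record

const results = [next(), next(), other()];
console.log(results); // [ 1, 2, 1 ]
```

makeCounter has long since returned when next is called, yet count is still there, and the two counters do not share it.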
The static chain described in "Principles of Programming Languages" is created at call time. However, the relationships between static links are determined when the code is compiled. For example, in the following code:
PROCEDURE a;
PROCEDURE b;
END
PROCEDURE c;
END
END
In this code, the static links of b and c point to a. If b is called and a variable used in b is not among b's local variables, the compiler generates code that searches up the stack along the static chain until the variable is found or a run-time error (RTE) is raised.
Unlike compiled languages such as Ada, JS is a fully interpreted language and functions can be created dynamically, which raises the problem of "static chain maintenance". Fortunately, a JS function cannot be modified in place; like symbols in erl, changing one amounts to redefining it. Therefore, the static link only needs to be set each time a function is defined. Regardless of whether the function is defined with function(){} or by an eval assignment, its static link is fixed once the function has been created.
Let's go back to the big example. When the interpreter reaches "function big(){...}", it creates a function instance in memory and sets its static link to the host. When big is called on the last line, the interpreter sets aside an area of memory as an ARI; let us call it ARI[big]. The execution pointer then moves to line 2.
When execution reaches line 3, the interpreter creates an instance of "f1", saves it in ARI[big], and sets its static link to ARI[big]. On the next line, the interpreter creates an instance of "f2" and sets its static link in the same way. Then, at line 5, f2 is called, which creates ARI[f2]; f2 calls f1, which creates ARI[f1]; and for f1 to output x, it has to address x.
Addressing of simple variables
Let's continue. We now need to address x, but x does not appear among the local variables of f1, so the interpreter must search up the stack to find it. Judging from the output, the interpreter does not search the "stack" layer by layer but makes jumps, because at this point the "stack" is:
|f1 | ←Thread pointer
|f2 | x = 2
|big | x = 1
|HOST|
If the interpreter really searched the stack layer by layer, the output would be 2. This touches on the essence of JS variable addressing: searching along the static chain.
Continuing with the above problem, the execution pointer searches along the static chain of f1 and finds big. It happens that big has x=1, so 1 is output, and everything is fine.
So, can static links form a loop and cause an "infinite loop" during addressing? Don't worry: remember that functions are nested within one another. In other words, functions form a rooted tree, and every static chain must eventually converge on the host. Worrying about "pointer loops" is therefore unnecessary. (On the contrary, addressing in dynamically scoped languages can easily cause an infinite loop.)
Now we can summarize how simple variables are addressed: the interpreter first looks for the variable name among the local variables of the current function. If it is not found there, the interpreter follows the static chain back until the variable is found, or until it reaches the host and the variable is still not found.
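The lookup rule can be sketched with plain objects standing in for the stack records of the big example. This is only an illustration under simplifying assumptions: the record layout, the field names and the resolve helper are hypothetical, and the static link is stored on the record itself to keep the sketch short.

```javascript
// Hand-built records for the moment f1 runs: HOST <- big <- f2 <- f1 (call order).
const host = { name: 'HOST', vars: {},       staticLink: null, dynamicLink: null };
const big  = { name: 'big',  vars: { x: 1 }, staticLink: host, dynamicLink: host };
const f2   = { name: 'f2',   vars: { x: 2 }, staticLink: big,  dynamicLink: big };
const f1   = { name: 'f1',   vars: {},       staticLink: big,  dynamicLink: f2 };

// Walk one kind of link until the name is found or the chain runs out.
function resolve(record, name, linkField) {
  for (let r = record; r !== null; r = r[linkField]) {
    if (Object.prototype.hasOwnProperty.call(r.vars, name)) return r.vars[name];
  }
  return undefined; // reached the host without finding the name
}

const viaStaticChain = resolve(f1, 'x', 'staticLink');   // what JS does
const viaDynamicChain = resolve(f1, 'x', 'dynamicLink'); // layer-by-layer stack search

console.log(viaStaticChain, viaDynamicChain); // 1 2
```

Following the static links skips f2 entirely, which is the "jump" described above.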
The life of ARI
Now let's take a closer look at the ARI. While a function executes, its ARI records the local variables (including parameters), the this pointer, the dynamic link and, most importantly, the address of the function instance. We can imagine that an ARI has the following structure:
ARI :: {
variables :: *variableTable, //Variable table
dynamicLink :: *ARI, //Dynamic link
instance :: *functioninst //Function instance
}
variables includes all local variables, parameters and the this pointer; dynamicLink points to the ARI of its caller; instance points to the function instance. The function instance, in turn, contains:
functioninst:: {
source:: *jsOperations, //Function instructions
staticLink:: *ARI, //Static link
......
}
When the function is called, the following "formal code" is actually executed:
*ARI p;
p = new ARI();
p->dynamicLink = thread.currentARI;
p->instance = called function;
p->variables.insert(parameter list, this reference);
thread.transfer(p->instance->source[0]);
Did you see it? Create the ARI, push the parameters and this into the variable table, and then transfer the thread pointer to the first instruction of the function instance.
What about when the function is created? After the function instruction is assigned, we also need:
newFunction->staticLink = thread.currentARI;
Now the problem is clear, we created a static link when the function was defined, which directly points to the current ARI of the thread. This can explain almost all simple variable addressing problems. For example, the following code:
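The two rules (an ARI is created at call time, the static link is set at definition time) can be modelled in a few lines of ordinary JavaScript. This is a toy model, not how any real engine is implemented: thread, defineFunction, callFunction and lookup are hypothetical names, and each function body receives its own ARI as an argument.

```javascript
// The host record; thread.currentARI always points at the running record.
const thread = {
  currentARI: { variables: { x: 'x in host' }, dynamicLink: null, instance: null }
};

// Definition time: remember the ARI that is current right now.
function defineFunction(body) {
  return { source: body, staticLink: thread.currentARI };
}

// Call time: create a fresh ARI, link it to the caller, run the body.
function callFunction(inst, thisArg, params) {
  const p = {
    variables: Object.assign({ this: thisArg }, params),
    dynamicLink: thread.currentARI,
    instance: inst
  };
  thread.currentARI = p;
  try {
    return inst.source(p);
  } finally {
    thread.currentARI = p.dynamicLink; // return to the caller
  }
}

// Addressing: local variables first, then along the static chain.
function lookup(ari, name) {
  for (let r = ari; r !== null; r = r.instance ? r.instance.staticLink : null) {
    if (Object.prototype.hasOwnProperty.call(r.variables, name)) return r.variables[name];
  }
  throw new ReferenceError(name + ' is not defined');
}

// The a / c / d functions from the earlier example, rewritten for the model.
const a = defineFunction(function (ari) { return lookup(ari, 'x'); });
const c = defineFunction(function (ari) {
  ari.variables.x = 'x inside c';
  return callFunction(a, undefined, {});
});
const d = defineFunction(function (ari) {
  ari.variables.x = 'x inside d';
  return defineFunction(function (inner) { return lookup(inner, 'x'); });
});

const fromC = callFunction(c, undefined, {});
const fromD = callFunction(callFunction(d, undefined, {}), undefined, {});
console.log(fromC); // x in host
console.log(fromD); // x inside d
```

In this model a was defined while the host record was current, so its lookups never see the x of c; the function returned by d was defined while the ARI of d was current, so it does see the x of d.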
function test(){
for(i=0;i<5;i++){
(function(t){ //This anonymous function is tentatively called f
setTimeout(function(){echo('' + t)},1000) //The anonymous function here is called g
})(i)
}
}
test()
The effect of this code is to output 0 1 2 3 4 in order after a delay of 1 second. Let's focus on the function passed to setTimeout. When it is created, its static link points to (an ARI of) the anonymous function f, whose variable table contains t (parameters are regarded as local variables). Therefore, when the timeout expires, the anonymous function g searches for the variable t and finds it in the ARI of f. So 0 1 2 3 4 is output one by one, in the order of creation.
The single function instance of the anonymous function f has 5 ARIs in total (remember that an ARI is created every time a function is called?). Correspondingly, g is also "created" 5 times. Before the first setTimeout expires, the stack holds something equivalent to the following records (I have written g out as 5 separate functions):
ARI of test [i=5 at the end of the loop]
| ARI of f; t=0 ←------ Static link of g0
| ARI of f; t=1 ←------ Static link of g1
| ARI of f; t=2 ←------ Static link of g2
| ARI of f; t=3 ←------ Static link of g3
| ARI of f; t=4 ←------ Static link of g4
------
And when g0 is called, the "stack" looks like the following:
ARI of test [i=5 at the end of the loop]
| ARI of f; t=0 ←------ Static link of g0
| ARI of f; t=1 ←------ Static link of g1
| ARI of f; t=2 ←------ Static link of g2
| ARI of f; t=3 ←------ Static link of g3
| ARI of f; t=4 ←------ Static link of g4
------
ARI of g0
| Here we need to address t, so... t=0
------
The ARI of g0 may not be located among the ARIs of the f series and can be regarded as sitting directly in the host; but the static links that addressing relies on still point at the individual ARIs of f, so naturally nothing goes wrong. And because the setTimeout callbacks are pushed into the waiting queue in order, the final output appears in the order 0 1 2 3 4.
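To see why the wrapper f matters, here is a sketch that runs synchronously. It assumes a hypothetical later helper and a callback queue in place of setTimeout, and it registers a second callback in each iteration that reads i directly instead of going through f.

```javascript
// Hypothetical stand-in for setTimeout: queue the callbacks, run them afterwards.
const queue = [];
function later(fn) { queue.push(fn); }

const withWrapper = [];
const withoutWrapper = [];

function test() {
  for (var i = 0; i < 5; i++) {
    // g created inside f: its static link points at this call's record, where t lives.
    (function (t) {
      later(function () { withWrapper.push(t); });
    })(i);
    // Created directly in test: all five share the single record of test, where i lives.
    later(function () { withoutWrapper.push(i); });
  }
}

test();
queue.forEach(function (fn) { fn(); }); // "the timers expire"

console.log(withWrapper);    // [ 0, 1, 2, 3, 4 ]
console.log(withoutWrapper); // [ 5, 5, 5, 5, 5 ]
```

The callbacks without the wrapper all address i in the one ARI of test, and by the time they run the loop has left i at 5.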
Will the static link be modified when the function is redefined?
Now let’s look at the next question: When a function is defined, a static link will be established. Then, when the function is redefined, will another static link be established? Let’s look at the example first:
var x = "x in host";
f = function(){echo(x)};
f();
function big(){
var x = 'x in big';
f();
f = function(){echo(x)};
f()
}
big()
Output:
x in host
x in host
x in big
This example may be easier to understand. When big runs, f in the host is redefined, and the static link of the "new" f points to big, so the last line outputs 'x in big'.
However, the following example is much more interesting:
var x = "x in host";
f = function(){echo(x)};
f();
function big(){
var x = 'x in big';
f();
var f1 = f;
f1();
f = f;
f()
}
big()
Output:
x in host
x in host
x in host
x in host
Didn't we say that redefinition modifies the static link? However, the two assignments here are only assignments: they change just the pointers f1 and f (remember that JS functions are reference types?). In the actual instance of f, the static link has not changed! So all four outputs are x in host.
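The two cases can be put side by side in one sketch. The functions return their value instead of calling echo, and the result object is a hypothetical container added only for this illustration.

```javascript
var x = 'x in host';
var f = function () { return x; };

function big() {
  var x = 'x in big';
  var f1 = f;                      // plain assignment: copies the reference only
  var sameInstance = (f1 === f);   // still the very same function instance
  f = function () { return x; };   // a new instance, created while big is running
  return { sameInstance: sameInstance, viaF1: f1(), viaNewF: f() };
}

const result = big();
console.log(result.sameInstance); // true
console.log(result.viaF1);        // x in host
console.log(result.viaNewF);      // x in big
```

Only creating a new function instance sets a new static link; copying a reference never does.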
The problem of component (attribute) addressing in the structure (object)
People from Christianity (Java) and Mormonism (csh), please forgive me for using this strange name, but JS objects are just too similar to hash tables. Let's consider this addressing problem:
a.b
A compiled language will generate code that finds a and then moves a fixed offset from it to find b. However, JS is a fully dynamic language: members can be added to or removed from an object at will, and prototypes come into play as well, which makes the addressing of JS object members very interesting.
Objects are hash tables
Except for a few special methods (and prototype members), objects are almost the same as hash tables, because methods and properties can both be stored in the "slots" of the "hash table". Yueying implemented a HashTable class in his "JS Return of the King".
Addressing the properties of the object itself
An object's "own" properties are those for which hasOwnProperty returns true. From an implementation perspective, they are the members of the object's own "hash table". For example:
function Point(x,y){
this.x = x;
this.y = y;
}
var a = new Point(1,2);
echo("a.x:" + a.x)
The Point constructor creates the "Point" object a and sets its x and y properties; therefore, the member table of a contains:
| x | ---> 1
| y | ---> 2
When searching for a.x, the interpreter first finds a, then searches for x in the member table of a, and gets 1.
Setting methods on objects from the constructor is not a good strategy because it will cause two objects of the same type to have unequal methods:
function Point(x,y){
this.x = x;
this.y = y;
this.abs = function(){return Math.sqrt(this.x*this.x + this.y*this.y)}
}
var a = new Point(1,2);
var b = new Point(1,2);
echo("a.abs == b.abs ? " + (a.abs==b.abs));
echo("a.abs === b.abs ? " + (a.abs===b.abs));
Both outputs are false, because on the fourth line the object's abs member (method) is created anew on every call, so a.abs and b.abs actually point to two completely different function instances. Therefore, two methods that appear to be equal are in fact not equal.
Introducing the problem of prototype addressing
prototype is a property of a function (class) that points to an object (not a class). The idea of a "prototype" can be compared to "drawing a tiger with a cat as the model": there is no relationship in which the "tiger" inherits from the "cat", only a relationship in which the "tiger" resembles the "cat". The prototype is about similarity. In JS, the code can be written as:
Tiger.prototype = new Cat()
The prototype of a function can also be just a blank object:
SomeClass.prototype = {}
Let's go back to addressing. What if you use . to get a property and it happens to be a property of the prototype? What we observe is that it is indeed obtained, but how? And what if a property of the object itself has the same name as a prototype property? Fortunately, the properties of the object itself take precedence.
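A short sketch of that precedence rule. The Cat and Tiger constructors and their sound and legs properties are made up for illustration; only the Tiger.prototype = new Cat() line comes from the text.

```javascript
function Cat() { this.sound = 'meow'; this.legs = 4; }
function Tiger() { this.sound = 'roar'; }
Tiger.prototype = new Cat();

const t = new Tiger();
const hasOwn = function (obj, key) {
  return Object.prototype.hasOwnProperty.call(obj, key);
};

const ownFirst = t.sound;                // 'roar': found in the object's own table
const fromPrototype = t.legs;            // 4: not own, found on the prototype
const legsIsOwn = hasOwn(t, 'legs');     // false

delete t.sound;                          // remove the own property...
const afterDelete = t.sound;             // ...and the prototype's value shows through

console.log(ownFirst, fromPrototype, legsIsOwn, afterDelete); // roar 4 false meow
```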
Defining methods on the prototype is a good design strategy. If we change the example above:
function Point(x,y){
this.x = x;
this.y = y
}
Point.prototype.abs = function(){return Math.sqrt(this.x*this.x + this.y*this.y)}
var a = new Point(1,2);
var b = new Point(1,2);
echo("a.abs == b.abs ? " + (a.abs==b.abs));
echo("a.abs === b.abs ? " + (a.abs===b.abs));
Now the outputs are finally equal: both are true. The reason is that a.abs and b.abs both point to the abs member of the Point class prototype. However, Point.prototype.abs cannot be accessed directly; doing so raises an error in testing. Correction: after retesting, the "cannot access Point.prototype.abs" problem turned out to be an issue with the JSConsole I was using. The reply is correct, thank you for the correction!
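As a quick check, here is a sketch of the prototype version with the comparisons stored in variables. The values 3 and 4 are chosen only so that abs returns a round number.

```javascript
function Point(x, y) {
  this.x = x;
  this.y = y;
}
Point.prototype.abs = function () {
  return Math.sqrt(this.x * this.x + this.y * this.y);
};

const a = new Point(3, 4);
const b = new Point(1, 2);

const sharedBetweenInstances = (a.abs === b.abs);              // true
const sameAsPrototypeMember = (a.abs === Point.prototype.abs); // true
const absIsOwn = Object.prototype.hasOwnProperty.call(a, 'abs'); // false
const length = a.abs();                                        // 5

console.log(sharedBetweenInstances, sameAsPrototypeMember, absIsOwn, length);
// true true false 5
```

Point.prototype.abs is directly accessible, and this inside abs is still the object the method was called on.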
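The prototype chain can be very long, and it can even be wound into a ring. Consider the following code:
A = function(x){this.x = x};
B = function(x){this.y = x};
A.prototype = new B(1);
B.prototype = new A(1);
var a = new A(2);
echo(a.x + ' , ' + a.y);
var b = new B(2);
echo(b.x + ' , ' + b.y);
The relationship described here is probably "I resemble you, and you resemble me". The prototype pointers produce the following output:
2 , 1
1 , 2
When a.y is looked up, the search follows the prototype chain to "a.prototype" and finds it there, so the output is 1; the same goes for b.x. Now let's try to output an unregistered property, "a.z":
echo(typeof a.z)
I was surprised that there is no infinite loop here. The interpreter seems to have a mechanism for dealing with a prototype chain that becomes a loop. A likely reason is that no ring is actually formed: an object's prototype link is fixed when the object is created with new, so the B instance stored in A.prototype still links to the earlier B.prototype and the chain simply ends. At the same time, prototypes form either a tree or a single ring and cannot have a multi-ring structure; this is very simple graph theory.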
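One way to inspect what the chain really looks like is to walk it with Object.getPrototypeOf. The sketch below is an illustration, not part of the original test: chainLength and originalBPrototype are hypothetical names, and the last step tries to close a ring on purpose with Object.setPrototypeOf.

```javascript
var A = function (x) { this.x = x; };
var B = function (x) { this.y = x; };
var originalBPrototype = B.prototype; // remember the object B started with

A.prototype = new B(1);
B.prototype = new A(1);
var a = new A(2);
var b = new B(2);

// Count the links from an object up to the end of its prototype chain.
function chainLength(obj) {
  var n = 0;
  for (var p = Object.getPrototypeOf(obj); p !== null; p = Object.getPrototypeOf(p)) {
    n += 1;
  }
  return n;
}

var lengthOfA = chainLength(a); // a -> B instance -> original B.prototype -> Object.prototype
var lengthOfB = chainLength(b); // b -> A instance -> B instance -> original B.prototype -> Object.prototype
var typeOfMissing = typeof a.z;

// Deliberately trying to make the chain circular is rejected by the engine.
var cycleRejected = false;
try {
  Object.setPrototypeOf(originalBPrototype, a);
} catch (e) {
  cycleRejected = e instanceof TypeError;
}

console.log(lengthOfA, lengthOfB, typeOfMissing, cycleRejected); // 3 4 undefined true
```

Both chains are finite, so a lookup of a.z simply runs off the end and yields undefined.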
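this: the hidden rule of functions
The most troublesome hidden rule in method (function) calls is this. Logically speaking, this is a pointer to the caller (an object). If this always pointed to the caller, the world would be a wonderful place. But this damned pointer will "kick your dog" every now and then. The things that can change it include call, apply, asynchronous calls and "window.eval".
I prefer to treat this as a parameter, like self in Lua. Lua's self can be passed explicitly or supplied by calling with a colon:
a:f(x,y,z) === a.f(a,x,y,z)
The same is true of "plain" method calls in JS, which look like this:
a.f(x,y,z) === a.f.call(a,x,y,z)
f.call is the truly "clean" form of invocation, just like the clean call in Lua. Many people say that Lua is a clearer version of JS: it simplifies many things in JS and brings many of JS's hidden rules into the open.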
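A sketch of that equivalence, with this treated as just another argument. The objects a and b and the method f are made up for illustration.

```javascript
const a = {
  name: 'a',
  f: function (x, y, z) { return this.name + ':' + (x + y + z); }
};
const b = { name: 'b' };

const direct = a.f(1, 2, 3);              // this is supplied implicitly
const explicit = a.f.call(a, 1, 2, 3);    // the same call with this spelled out
const redirected = a.f.call(b, 1, 2, 3);  // same function, different this
const applied = a.f.apply(b, [1, 2, 3]);  // apply takes the arguments as an array

console.log(direct, explicit, redirected, applied); // a:6 a:6 b:6 b:6
```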
The principle of fixing this
This is the "use a closure to fix this" technique from "Return of the King" mentioned above. Look at the code first:
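button1.onclick = (
function(e){return function(){button_click.apply(e,arguments)}}
)(button1)
Don't underestimate this line of code. It actually creates an ARI and binds button1 into it, then returns a function that forces button_click to be called with e as the caller (subject), so the this passed to button_click is e, that is, button1! Once the event binding is complete, the environment looks roughly like this:
button1.onclick = _F_; //Give the returned anonymous function a name
_F_.staticLink = _ARI_; //The ARI of the anonymous function that was called right after being created
_ARI_[e] = button1 //e in the parameter table of the anonymous ARI, which is also the e that _F_ looks for
So when the button is clicked, _F_ is called, and _F_ invokes button_click with e as the caller. By the earlier analysis, e equals button1, so we get a safe "specified caller" method. Perhaps we can take this idea further and build a general-purpose interface:
bindFunction = function(f,e){ //We are good people: we don't change the prototype, no we don't...
return function(){
f.apply(e,arguments)
}
}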
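A usage sketch for bindFunction that runs without a browser. It assumes a plain object as a stand-in for the DOM button and a button_click that returns a string; unlike the version above, this variant also returns the result of f, which is a small addition made for the sake of the example.

```javascript
// Variant of bindFunction that also passes the return value through (an assumption).
function bindFunction(f, e) {
  return function () {
    return f.apply(e, arguments);
  };
}

// Hypothetical stand-ins for a DOM element and its handler.
const button1 = { id: 'button1' };
function button_click(eventName) {
  return this.id + ' got ' + eventName;
}

const handler = bindFunction(button_click, button1);

const normalCall = handler('click');                        // this is button1
const hijackedCall = handler.call({ id: 'other' }, 'click'); // this is still button1
const builtIn = button_click.bind(button1)('click');         // the standard Function.prototype.bind

console.log(normalCall);   // button1 got click
console.log(hijackedCall); // button1 got click
console.log(builtIn);      // button1 got click
```

Whatever this the handler itself is called with, the closure still finds e along its static link, which is why the binding is safe. Modern engines offer the same effect through Function.prototype.bind.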
