Writing a Compiler (66): Variable Declarations Inside Blocks
Goal for this installment
I'm adding blocks {...} and supporting variable declarations inside them.
    // Variable declarations inside a block
    extern int puts(char *str);

    int main()
    {
        int a = 10, b = 20;
        printf("1: a = %d b = %d\n",a,b);
        {
            int a = 55;
            printf("2: a = %d b = %d\n",a,b);
        }
        int c = 30;
        printf("3: a = %d b = %d c = %d\n",a,b,c);
        a = 100;
        printf("4: a = %d b = %d c = %d\n",a,b,c);
    }
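I'll also push on and support variable declarations inside the blocks of while statements and the like.
initialize
Two variables are added to manage block information.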
    # Constructor
    def initialize(fname)
      @fname = fname                                    # source file name
      @asmfname = fname.sub(/\.myc$/,'.s')              # assembly output file name
      @regs32 = ["edi", "esi","edx","ecx","r8d","r9d"]  # 32-bit registers
      @regs64 = ["rdi", "rsi","rdx","rcx","r8", "r9" ]  # 64-bit registers
      @lex = Lexer.new(@fname)                          # lexer
      @funcname = nil                                   # name of the function being processed
      @labelcnt = nil                                   # number of auto-generated labels (per function)
      @literalcnt = 0                                   # number of string literals
      @literaltable = []                                # list of string literals
      @functions = Hash.new                             # functions
      @lvars = nil                                      # local variables
      @lvarsize = nil                                   # size of the area reserved on the stack
      @breaklabel = nil                                 # label that break jumps to
      @codebuffer = []                                  # code buffer
      @numuseregs = 0                                   # number of registers in use for a function call
      @numblock = nil                                   # number of blocks
      @blocks = nil                                     # nested blocks (["B2#","B1#",""])
    end
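Each block gets its own name: B1#, B2#, and so on. Nested blocks are tracked in an Array called blocks, written like ["B2#","B1#",""]. Entries run from the innermost block on the left outward, and the rightmost "" is the function's own block, which I leave unnamed.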
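To make the naming scheme concrete, here is a minimal stand-alone sketch of how such a stack of block names could evolve. BlockStack and its enter/leave methods are hypothetical names used only for illustration; the compiler itself manipulates @numblock and @blocks directly, and the initial values below are an assumption.

```ruby
# Hypothetical stand-alone sketch; BlockStack is not part of the compiler.
class BlockStack
  attr_reader :blocks

  def initialize
    @numblock = 0      # number of blocks seen so far (assumed to start at 0)
    @blocks   = [""]   # the function body itself is the unnamed block
  end

  # Enter a new block: give it the next name and make it the innermost one.
  def enter
    @numblock += 1
    @blocks.unshift "B#{@numblock}#"
  end

  # Leave the innermost block.
  def leave
    @blocks.shift
  end
end

s = BlockStack.new
s.enter            # {
s.enter            #   {
p s.blocks         # ["B2#", "B1#", ""]
s.leave            #   }
s.leave            # }
s.enter            # {   a sibling block gets a fresh name
p s.blocks         # ["B3#", ""]
```

Because the counter only ever goes up, two sibling blocks never share a name, even though the stack returns to the same depth between them.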
Helper methods
Three methods are added.
    # Look up variable information
    def get_var(var)
      # Is it declared somewhere in the nested blocks?
      @blocks.each do |blk|
        v = @lvars[blk + var]
        if v then
          return v
        end
      end
      # Not found anywhere
      return nil
    end

    # Check a variable
    def check_var(var)
      # Is it declared in the current block?
      return @lvars[@blocks[0] + var]
    end

    # Register variable information
    def set_var(var,info)
      @lvars[@blocks[0] + var] = info
    end
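Variables inside a block are registered in lvars under a decorated name, e.g. a => B1#a. Because of that, from now on every access to the variable table lvars goes through these methods.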
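The lookup rules are easier to see in isolation. The following is a small stand-alone model of the three helpers, assuming a plain Hash for lvars; ScopeTable, enter and leave are hypothetical names for this sketch, and the offsets are simply the ones that appear in this article's test program.

```ruby
# Hypothetical stand-alone model of the three helpers; ScopeTable is an
# illustrative name, not a class in the compiler.
class ScopeTable
  attr_reader :lvars

  def initialize
    @lvars  = {}     # decorated name => [type, stack offset]
    @blocks = [""]   # innermost block first, function body last
  end

  def enter(name)
    @blocks.unshift name
  end

  def leave
    @blocks.shift
  end

  # Search from the innermost block outwards.
  def get_var(var)
    @blocks.each do |blk|
      v = @lvars[blk + var]
      return v if v
    end
    nil
  end

  # Look only at the current (innermost) block.
  def check_var(var)
    @lvars[@blocks[0] + var]
  end

  # Register under the current block's decorated name.
  def set_var(var, info)
    @lvars[@blocks[0] + var] = info
  end
end

t = ScopeTable.new
t.set_var "a", ["int", 4]
t.set_var "b", ["int", 8]
t.enter "B1#"
p t.check_var("a")   # nil: no "a" in B1 yet, so shadowing is allowed
t.set_var "a", ["int", 12]
p t.get_var("a")     # ["int", 12]  (found as "B1#a")
p t.get_var("b")     # ["int", 8]   (falls through to the function block)
t.leave
p t.get_var("a")     # ["int", 4]   (the outer "a" is visible again)
```

Note that the inner entry stays in the table after the block is left; it simply becomes unreachable because its prefix is no longer on the stack.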
statement
Block handling is added.
    elsif kind == TK::SYMBOL && str == "{" then
      # block handling
      @numblock += 1
      @blocks.unshift "B#{@numblock}#"
      kind, str = block
      @blocks.shift
      kind, str = @lex.gettoken
      return kind, str
The blocks information is adjusted right before and right after the call to the block method.
    elsif kind == TK::RESERVE && str == "if" then
      # if statement handling
      @labelcnt += 1
      else_label = ".LBB_" + @funcname + "_" + @labelcnt.to_s
      @labelcnt += 1
      exit_label = ".LBB_" + @funcname + "_" + @labelcnt.to_s
      kind, str = @lex.gettoken
      if kind != TK::SYMBOL || str != "(" then
        perror
      end
      kind, str = @lex.gettoken
      kind, str = expr kind, str
      if kind != TK::SYMBOL || str != ")" then
        perror
      end
      codegen " jz " + else_label
      kind, str = @lex.gettoken
      if kind == TK::SYMBOL && str == "{" then
        @numblock += 1
        @blocks.unshift "B#{@numblock}#"
        kind, str = block
        @blocks.shift
        kind, str = @lex.gettoken
      else
        kind, str = statement kind, str
      end
      if kind == TK::RESERVE && str == "else" then
        codegen " jmp " + exit_label
        codegen else_label + ":"
        kind, str = @lex.gettoken
        if kind == TK::SYMBOL && str == "{" then
          @numblock += 1
          @blocks.unshift "B#{@numblock}#"
          kind, str = block
          @blocks.shift
          kind, str = @lex.gettoken
        else
          kind, str = statement kind, str
        end
      else
        codegen else_label + ":"
      end
      codegen exit_label + ":"
      return kind, str
    elsif kind == TK::RESERVE && str == "while" then
      # while statement handling
      @labelcnt += 1
      cond_label = ".LBB_" + @funcname + "_" + @labelcnt.to_s
      @labelcnt += 1
      exit_label = ".LBB_" + @funcname + "_" + @labelcnt.to_s
      breaklabelsave = @breaklabel
      @breaklabel = exit_label
      codegen cond_label + ":"
      kind, str = @lex.gettoken
      if kind != TK::SYMBOL || str != "(" then
        perror
      end
      kind, str = @lex.gettoken
      kind, str = expr kind, str
      if kind != TK::SYMBOL || str != ")" then
        perror
      end
      codegen " jz " + exit_label
      kind, str = @lex.gettoken
      if kind == TK::SYMBOL && str == "{" then
        @numblock += 1
        @blocks.unshift "B#{@numblock}#"
        kind, str = block
        @blocks.shift
        kind, str = @lex.gettoken
      else
        kind, str = statement kind, str
      end
      codegen " jmp " + cond_label
      codegen exit_label + ":"
      @breaklabel = breaklabelsave
      return kind, str
    elsif kind == TK::RESERVE && str == "for" then
      # for statement handling
      @labelcnt += 1
      cond_label = ".LBB_" + @funcname + "_" + @labelcnt.to_s
      @labelcnt += 1
      cont_label = ".LBB_" + @funcname + "_" + @labelcnt.to_s
      @labelcnt += 1
      body_label = ".LBB_" + @funcname + "_" + @labelcnt.to_s
      @labelcnt += 1
      exit_label = ".LBB_" + @funcname + "_" + @labelcnt.to_s
      breaklabelsave = @breaklabel
      @breaklabel = exit_label
      # one block covers the whole for statement (header and body)
      @numblock += 1
      @blocks.unshift "B#{@numblock}#"
      kind, str = @lex.gettoken
      if kind != TK::SYMBOL || str != "(" then
        perror
      end
      kind, str = @lex.gettoken
      if kind != TK::SYMBOL || str != ";" then
        kind, str = expr kind, str
      end
      if kind != TK::SYMBOL || str != ";" then
        perror
      end
      codegen cond_label + ":"
      kind, str = @lex.gettoken
      if kind != TK::SYMBOL || str != ";" then
        kind, str = expr kind, str
        codegen " jz " + exit_label
      end
      codegen " jmp " + body_label
      if kind != TK::SYMBOL || str != ";" then
        perror
      end
      codegen cont_label + ":"
      kind, str = @lex.gettoken
      if kind != TK::SYMBOL || str != ")" then
        kind, str = expr kind, str
      end
      if kind != TK::SYMBOL || str != ")" then
        perror
      end
      codegen " jmp " + cond_label
      codegen body_label + ":"
      kind, str = @lex.gettoken
      if kind == TK::SYMBOL && str == "{" then
        kind, str = block
        kind, str = @lex.gettoken
      else
        kind, str = statement kind, str
      end
      codegen " jmp " + cont_label
      codegen exit_label + ":"
      @blocks.shift
      @breaklabel = breaklabelsave
      return kind, str
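The blocks of if, while and for statements are handled the same way. The one difference is for: it pushes its block name before the header is parsed and pops it after the body, so the header and the body share a single block.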
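The same push, parse, pop sequence shows up in several branches. As one possible way to keep those pairs balanced, the sketch below wraps them in a helper that takes a Ruby block. with_block and BlockScopes are hypothetical names; the compiler in this article writes the sequence out by hand instead.

```ruby
# Hypothetical refactoring sketch; not how the compiler is written.
class BlockScopes
  attr_reader :blocks

  def initialize
    @numblock = 0      # assumed starting value
    @blocks   = [""]   # unnamed function block
  end

  # Run the given code with a fresh block name pushed, then pop it again.
  # The ensure clause pops the name even if the body raises.
  def with_block
    @numblock += 1
    @blocks.unshift "B#{@numblock}#"
    yield
  ensure
    @blocks.shift
  end
end

scopes = BlockScopes.new
seen = []
scopes.with_block { seen << scopes.blocks.dup }   # e.g. the if body
scopes.with_block { seen << scopes.blocks.dup }   # e.g. the else body
p seen            # [["B1#", ""], ["B2#", ""]]
p scopes.blocks   # [""]
```

With a helper like this, the if body and the else body each get their own name, and a forgotten shift can no longer leave a stale name on the stack.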
    elsif kind == TK::TYPE then
      # variable declaration handling
      basetype = str
      loop do
        type = basetype
        kind, str = @lex.gettoken
        if kind == TK::SYMBOL && str == "*" then
          type += str
          kind, str = @lex.gettoken
        end
        if kind != TK::ID then
          perror
        end
        print "var "+str+"\n" if $opt_d
        @lvarsize += sizeof(type)
        if check_var str then
          perror "redefinition variable \"" + str +"\""
        end
        set_var str, [type,@lvarsize]
        skind, sstr = @lex.gettoken
        if skind == TK::SYMBOL && sstr == "=" then
          kind, str = expr2 kind, str, skind, sstr
        else
          kind, str = skind, sstr
        end
        if kind != TK::SYMBOL || str != "," then
          break
        end
      end
      if kind != TK::SYMBOL || str != ";" then
        perror "expected ';' after variables"
      end
Changed to go through check_var and set_var.
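Since check_var looks only at the current block, a name may be redeclared in an inner block but not twice in the same one. Here is a rough stand-alone sketch of that declaration step, replaying the declarations of this article's test program. LocalTable, declare and SIZES are hypothetical, only int is modelled, and an exception stands in for perror.

```ruby
# Hypothetical stand-alone model; LocalTable and SIZES are illustrative only.
class LocalTable
  SIZES = { "int" => 4 }   # assumption: only int is modelled here

  attr_reader :lvars

  def initialize
    @lvars    = {}     # decorated name => [type, stack offset]
    @lvarsize = 0      # bytes handed out so far
    @blocks   = [""]   # innermost block first
  end

  def enter(name)
    @blocks.unshift name
  end

  def leave
    @blocks.shift
  end

  # Declare a variable in the current block, or raise on redefinition.
  def declare(type, name)
    key = @blocks[0] + name
    raise ArgumentError, "redefinition variable \"#{name}\"" if @lvars[key]
    @lvarsize += SIZES.fetch(type)
    @lvars[key] = [type, @lvarsize]
  end
end

t = LocalTable.new
t.declare "int", "a"
t.declare "int", "b"
t.enter "B1#"
t.declare "int", "a"     # allowed: shadows the outer a
t.leave
t.declare "int", "c"
t.lvars.each { |k, v| puts "#{k}: #{v[0]} at offset #{v[1]}" }
# a: int at offset 4
# b: int at offset 8
# B1#a: int at offset 12
# c: int at offset 16

begin
  t.declare "int", "a"   # same block as the first a
rescue ArgumentError => e
  puts e.message         # redefinition variable "a"
end
```

In this model the offset counter never goes back down, so the slot used by the inner a is not reused by c.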
function
The parameter handling part.
    # Parameter handling
    kind, str = @lex.gettoken
    loop do
      if kind == TK::SYMBOL && str == ")" then
        break
      end
      if kind == TK::TYPE then
        if str == "extern" then
          perror "invalid 'extern'"
        end
        type = str
        kind, str = @lex.gettoken
        if kind == TK::SYMBOL && str == "*" then
          type += str
          kind, str = @lex.gettoken
        end
        paratype << type
        if kind != TK::ID then
          perror "wrong parameter name"
        end
        print "para "+str+"\n" if $opt_d
        size = sizeof type
        @lvarsize += size
        parametersize << size
        if check_var str then
          perror "redefinition parameter \"" + str +"\""
        end
        set_var str, [type,@lvarsize]
      else
        perror
      end
      kind, str = @lex.gettoken
      if kind == TK::SYMBOL && str == "," then
        kind, str = @lex.gettoken
      end
    end
This part is also changed to use check_var and set_var.
Code generation
Finally, this part gets fixed too.
    # Code generation for assignment
    def codegen_assign(el)
      if el.size != 3 then
        perror
      end
      type_r = codegen_el [el[2]]
      if el[0].kind_of?(Array) then
        perror
      end
      if el[0].kind != TK::ID then
        perror
      end
      v = get_var el[0].str
      if v == nil then
        perror "undeclared variable \"" + el[0].str + "\""
      end
      type_l = v[0]
      if type_r == "void*" && is_pointer_type?(type_l) then
        type_r = type_l # implicit type conversion
      end
      if type_l != type_r then
        perror
      end
      if type_l == "char*" then
        codegen " mov qword ptr [rbp - " + v[1].to_s + "], rax"
      else
        codegen " mov dword ptr [rbp - " + v[1].to_s + "], eax"
      end
      return type_l
    end

    # Code generation for an expression (left operand of a binary operation)
    def codegen_elf(operand)
      type = "int"
      if operand.kind_of?(Array) then
        if !operand[0].kind_of?(Array) && operand[0].kind == TK::ID && operand[1].str == "()" then
          type = codegen_func operand
        else
          type = codegen_el operand
        end
      elsif operand.kind == TK::NUMBER then
        codegen " mov eax, " + operand.str
      elsif operand.kind == TK::ID then
        v = get_var operand.str
        if v == nil then
          perror "undeclared variable \"" + operand.str + "\""
        end
        type = v[0]
        if type == "char*"
          codegen " mov rax, qword ptr [rbp - " + v[1].to_s + "]"
        else
          codegen " mov eax, dword ptr [rbp - " + v[1].to_s + "]"
        end
      elsif operand.kind == TK::STRING then
        type = "char*"
        label = addliteral operand.str
        codegen " lea rax, "+label
      else
        perror
      end
      return type
    end

    # Code generation for an expression (right operand of a binary operation)
    def codegen_els(op, operand, type_l)
      if op.str == "+" then
        ostr = "add "
      elsif op.str == "-" then
        ostr = "sub "
      elsif op.str == "*" then
        ostr = "imul"
      elsif op.str == "/" then
        ostr = "idiv"
      elsif op.str == "%" then
        ostr = "idiv"
      elsif op.str == "==" then
        ostr = "cmp "
      elsif op.str == "!=" then
        ostr = "cmp "
      elsif op.str == "<" || op.str == ">" || op.str == "<=" || op.str == ">=" then
        ostr = "cmp "
      else
        perror "unknown operator \"" + op.str + "\""
      end

      # Evaluate the right operand
      type_r = "int"
      if operand.kind_of?(Array) then
        if operand[0].size == 2 && operand[0].kind == TK::ID && operand[1].str == "()" then
          codegen " sub rsp, 8"
          codegen " push rax"
          type_r = codegen_func operand
          codegen " mov r10d, eax"
          codegen " pop rax"
          codegen " add rsp, 8"
        else
          codegen " sub rsp, 8"
          codegen " push rax"
          type_r = codegen_el operand
          codegen " mov r10d, eax"
          codegen " pop rax"
          codegen " add rsp, 8"
        end
        str = "r10d"
      elsif operand.kind == TK::ID then
        v = get_var operand.str
        if v == nil then
          perror "undeclared variable \"" + operand.str + "\""
        end
        type_r = v[0]
        if type_r == "char*"
          str = "qword ptr [rbp - " + v[1].to_s + "]"
        else
          str = "dword ptr [rbp - " + v[1].to_s + "]"
        end
      elsif operand.kind == TK::NUMBER then
        str = operand.str
      elsif operand.kind == TK::STRING then
        type_r = "char*"
        label = addliteral operand.str
        codegen " lea r10, "+label
        str = "r10"
      else
        perror
      end

      # Type check
      if type_l != type_r then
        if type_l == "char*" && type_r == "int" then
          if op.str != "+" && op.str != "-" then
            perror "mismatched types to binary operation"
          end
        elsif type_l == "int" && type_r == "char*" then
          if op.str != "+" && op.str != "-" then
            perror "mismatched types to binary operation"
          end
        else
          perror "mismatched types to binary operation"
        end
      elsif type_l == "char*" then
        perror "mismatched types to binary operation"
      end

      # Compute with the left and right operands
      if op.str == "==" then
        codegen " " + ostr + " eax, " + str
        codegen " sete al"
        codegen " and eax, 1"
      elsif op.str == "!=" then
        codegen " " + ostr + " eax, " + str
        codegen " setne al"
        codegen " and eax, 1"
      elsif op.str == "<" then
        codegen " " + ostr + " eax, " + str
        codegen " setl al"
        codegen " and eax, 1"
      elsif op.str == ">" then
        codegen " " + ostr + " eax, " + str
        codegen " setg al"
        codegen " and eax, 1"
      elsif op.str == "<=" then
        codegen " " + ostr + " eax, " + str
        codegen " setle al"
        codegen " and eax, 1"
      elsif op.str == ">=" then
        codegen " " + ostr + " eax, " + str
        codegen " setge al"
        codegen " and eax, 1"
      elsif op.str == "*" || op.str == "/" || op.str == "%" then
        if str != "r10d" then
          codegen " mov r10d, " + str
        end
        codegen " mov r11, rdx"
        if op.str == "/" || op.str == "%" then
          codegen " cdq"
        end
        codegen " " + ostr + " r10d"
        if op.str == "%" then
          codegen " mov eax, edx"
        end
        codegen " mov rdx, r11"
      else
        if type_l == "char*" && type_r == "int" then
          if str == op.str then
            codegen " " + ostr + " rax, " + str
          elsif str == "r10d" then
            codegen " movsx r10, r10d"
            codegen " " + ostr + " rax, r10"
          else
            codegen " mov r10d, " + str
            codegen " movsx r10, r10d"
            codegen " " + ostr + " rax, r10"
          end
        elsif type_l == "int" && type_r == "char*" then
          codegen " movsx rax, eax"
          codegen " " + ostr + " rax, " + str
          type_l = "char*"
        else
          codegen " " + ostr + " eax, " + str
        end
      end
      return type_l
    end
Changed to use get_var. There were a lot of places to fix this time. I hope I didn't miss any.
Test run
Here we go.
    ~/myc$ myc -d m31.myc
    para str
    var a
    [a, =, 10]
    [[a, =, 10]]
    var b
    [b, =, 20]
    [[b, =, 20]]
    [1: a = %d b = %d\n]
    [1: a = %d b = %d\n]
    [a]
    [a]
    [b]
    [b]
    [[printf, (), [1: a = %d b = %d\n], [a], [b]]]
    [[printf, (), [1: a = %d b = %d\n], [a], [b]]]
    var a
    [a, =, 55]
    [[a, =, 55]]
    [2: a = %d b = %d\n]
    [2: a = %d b = %d\n]
    [a]
    [a]
    [b]
    [b]
    [[printf, (), [2: a = %d b = %d\n], [a], [b]]]
    [[printf, (), [2: a = %d b = %d\n], [a], [b]]]
    var c
    [c, =, 30]
    [[c, =, 30]]
    [3: a = %d b = %d c = %d\n]
    [3: a = %d b = %d c = %d\n]
    [a]
    [a]
    [b]
    [b]
    [c]
    [c]
    [[printf, (), [3: a = %d b = %d c = %d\n], [a], [b], [c]]]
    [[printf, (), [3: a = %d b = %d c = %d\n], [a], [b], [c]]]
    [a, =, 100]
    [[a, =, 100]]
    [4: a = %d b = %d c = %d\n]
    [4: a = %d b = %d c = %d\n]
    [a]
    [a]
    [b]
    [b]
    [c]
    [c]
    [[printf, (), [4: a = %d b = %d c = %d\n], [a], [b], [c]]]
    [[printf, (), [4: a = %d b = %d c = %d\n], [a], [b], [c]]]
    {"a"=>["int", 4], "b"=>["int", 8], "B1#a"=>["int", 12], "c"=>["int", 16]}
    {"puts"=>["int", ["char*"]], "main"=>["int", []]}
    ~/myc$ ./m31
    1: a = 10 b = 20
    2: a = 55 b = 20
    3: a = 10 b = 20 c = 30
    4: a = 100 b = 20 c = 30
    ~/myc$
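Oh, it works. The debug output is, as always, hard to follow for anyone but me, but as designed the variable a inside the block shows up as B1#a, and after the block ends a refers to the outer variable again (lines 3 and 4 of the output). Given how many places I changed, the testing is thin. I really should go through them carefully one by one.
I had hoped to get as far as declaring the loop variable in the head of a for statement this time, but I ran out of steam partway. I'll work on that next time.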