しゃべれる理系: Language Processor

ラベル Language Processor の投稿を表示しています。すべての投稿を表示

2009-03-29

Absolute path / Relative path

cdコマンドの引数に対する構文解析器を作りました。

今までのcdコマンドでは絶対パスが表現できなかったり、隣接したディレクトリにしか移動できなかったりと、色々とショボイcdコマンドでした。しかしながら今回の更新により、例えば、ルートディレクトリに直接移動出来たり、パス名を連続して綴ることで隣接していないディレクトリに移動出来たりします。

ただ「これだけだと従来のcdコマンドと何も変わらいじゃないか！面白くナス！」と思ったので、言語処理系の授業で習った誤り処理も追加してみました。これにより、cdコマンドが失敗したとき、なぜ失敗したのかを教えてくれます。

関係ない話だけど、cdコマンド改造後の方がソースコードが短くなっのはビビったｗ。元々のcdコマンドがnaiveだったという原因もあるだろうけど、それ以上に再帰降下関数が最強すぎるという罠なのだろう。2単位の癖にSoC設計×2よりキツイと噂の言語処理系を、糞真面目に取り組んでおいて心の底から良かった思った瞬間。実は、言語処理系はオープン科目として受講できるみたいだけど、どう見ても死亡フラグです。本当にありがとうございました。
※ちなみに、言語処理系のノート等はここら辺にあります。参考資料としてどうぞ。

さて、メモリの離散的な確保も実装したし、cdコマンドも強化したので、後はドキュメント書くだけかな。ガリガリ書いていきます。

2008-06-05

Calculator with LP 2

I have finish the report! so next time, I'll study a backend of language processor!

but, now for a computational intelligence programming, namely "prolog"!
because its deadline is coming soon! lol

目次
1. 入力文字列の構文規則
2. 電卓を設計した過程
3. ソースプログラムの説明
3.1 字句解析
3.2 構文解析
3.3 意味解析
3.4 誤り処理
3.5 main関数
4. 実行例
5. 設計した電卓のソースプログラム

(!!CAUTAION!!:some indents were omitted)
1. 入力文字列の構文規則
以下に、今回設計した電卓プログラム(以下、本プログラム)の構文規則を示す。

expression = [ '-' | '+' ] term { ( '+' | '-' ) term }
term = primary { ( '*' | '/' | '%' ) primary }
primary = character | figure | '(' expression ')' | '[' expression ']'
character = ID [ '=' expression ]
figure = {number} [ '.' {number}]
number = 0|1|2|3|4|5|6|7|8|9
ID = 英文字(大文字、小文字含む)

ただし、ID = expression1 * expression2のような入力文字列が与えられた場合、ID = (expression1 * expression2)としてIDへの代入処理が行われる。
ID にexpression1だけを代入して、その結果とexpression2を計算させたい場合は、
( ID = expression1 ) * expression2 のように、括弧を使って明示的に示す必要がある。

2. 電卓を設計した過程
電卓を設計する上でまず始めに考えたことは、与えられた仕様を満たすように入力文字列の構文規則を設計することである。このとき構文規則は、後の具現化の作業が容易になるように、再帰降下法が適用できる形にした。こうして出来上がった構文規則が1.で示した構文規則である。
次に、具体的な字句解析の方法を考えると同時に、具現化の作業を行った。このとき、WL言語や教材ページの簡易電卓で使われている字句解析を参考にして、設計を行った。
その後、変数の登録・代入の設計方針と誤り処理の設計方針についてはひとまず置いておき、まずはここまでの仕様を具現化した。しかしながら、誤りが起こったときの処理で関数errorを呼び出して扱うことは明白だったので、誤りが構文規則以外の文字を読み取ったときには、ひとまず関数errorを呼び出して置くようにした。
代入と誤り処理以外の仕様をすべて満たす電卓が完成したら、次に代入の設計方針を固める作業を行った。代入の設計方針としては、英字列を読み取ったとき、代入文とともに呼び出される場合と、そうでない場合の２通りがあることに留意しながら設計を行った。このとき、代入文とともに呼び出される場合は、英字列と値の同期付けを行い、その結果をなんらかの方法で表に登録する必要がある。今回設計した電卓では、登録、または登録された英字列を読み出す方法として、ハッシュ法を採用した。そして、これらの設計方針を基にして、代入の具現化を行った。

最後に、誤り処理の設計を行った。しかしながら、誤り処理については具体的な設計アルゴリズムが思いつかなかったので、出来るだけjavaやCコンパイラの誤り処理に近づけるようにして設計する、という方針を採用した。具体的な設計方針としては、入力文字列のどこの部分がどのように間違えているのかを出来るだけ明確にして出力する、また、誤りが複数ある場合は、その誤りの数だけ指摘できる誤り処理がする、という設計方針を立てた。これらの設計方針に基づき、誤り処理の具現を行い、今回設計した電卓が完成した。

3. ソースプログラムの説明
本項では本プログラムのソースプログラムを解説する。具体的には、ソースプログラムを字句解析部、構文解析部、代入の処理部、誤りの処理部、そしてmain関数部の5つの部分に分けて、各部分について一つずつ解説を行っていく。

3.1 字句解析
本プログラムは、WL言語と同様に1次先読みの字句解析を行うようにしている。以下は、本プログラムで使用する1次先読みの字句解析プログラムの解説である。
字句解析を行うにあたり、まずは入力文字列を読み込む必要があるので、入力文字列を読み込み、読み込んだ文字列を配列charsに格納する関数get_line()を作成した。この関数では、入力文字列を関数getchar()を用いて、’\n’が押されるまでの間、入力文字を読み取り続けている。関数get_char()が’\n’を読み取ったとき、入力文字列が空行であればFalseを、一文字以上存在していればTrueを返している。これにより、与えられた仕様の”からの行を入力すると終了する”という部分の具現が容易になる。
次に、配列charsに格納された入力文字列を、呼び出される度に一字読み取っていく関数get_token()を作成した。具体的には、変数cpを入力文字列のポインタとして扱い、字句を読み取った後、変数cpをインクリメントしていき、一字ずつ入力文字列を読み取っている。
ただし、読み取った文字が以下のケースに該当し場合は、例外的に以下に示す処理を行う。

（条件：処理）
・空白スペース：空白スペース以外の文字が出現するまで次の文字を読み取り続ける。
・英字：英字以外の文字が出現するまで次の文字を読み取り続ける。また、読み取った文字列を一つの単語として配列strに格納する。

また、関数get_token()では、配列charsから一字読み取ると同時に、読み取った文字をカテゴリ別に分類する作業も行うようにしている。分類した結果は変数tokenに反映させている。このときの分類作業は、以下の仕様に則って行われる。

(カテゴリ名：カテゴライズされる文字)
1. Value：0から9の数字
2. Character：aからz、またはAからZ
3. Decimal：’.’
4. Lbracket：’{‘
5. Lpar：’(‘
6. Plus：’+’
7. Minus：’-’
8. Times：’*’
9. Divide：’/’
10. Modulo：’%’
11. Rpar：’)’
12. Rbracket：’}’
13. Equal：’=’
14. EOL：’\n’
15. Error：上記以外の文字、または記号

上記に示した各カテゴリは、enum宣言で列挙した。列挙する順番は、カテゴリ名の左に書かれている番号に準拠している。
本プログラムの字句解析では、読み取った英字列に対しての名前の同定は行わず、その英字列を配列strに一時的に保存するだけにとどめている。これは、次に読み取る字句によって、読み取った英字列に対して行う処理が変わるためである。よって、読み取った英字列に対しての名前の同定処理は、構文解析の部分で行うことにしている。
本プログラムの字句解析部は、これらの関数を用いて具現化されている。

3.2 構文解析
次に、字句解析で得た情報を基に構文解析をするプログラムについて解説する。構文解析プログラムは、1.で示した構文規則に再帰降下法を適用したプログラムとなっている。よって、構文規則より、関数expression(), term(), primary(), figure(), character()を生成し、これらの関数を再帰的に用いて構文解析を行っている。
関数expression(), term, primary()については、特にこだわった点は無く、構文規則に従い単純に再帰降下法を適用しただけである。一方で、関数figure(), character()については、読み取った文字列の数値化や名前の同定作業、変数の登録作業を行っている。
まずは、関数figure()について解説を行う。関数figure()が呼び出されるとき、構文規則より、変数tokenには必ずValueが入っていることになる。つまり、これから読み取る文字が、1つ以上の連続した数字列であることが保証されていることになる。よって、文字列の数値化を具現化する方法として、while文を使って読み取った文字が数字である限り、以下の関数を呼び出す方法を使った。

value= value*10+(chars[cp-1]-'0');
get_token();

一行目に書かれている文は、変数valueに10をかけて1の位を0にしたあと、読み取った文字番号から’0’の文字番号を引いた値をvalueに加算する、という文である。その後、関数get_token()を用いて次の文字を読み取る。これらの式により、入力された数字の文字列を十進整数定数化することが可能になる。
しかしながら、構文規則では数字列の直後に小数点が来ることを許している。つまり、入力された文字列が小数点を含んだ十進定数であっても、数値化できなくてはいけない。この仕様を具現化する手段としては、小数点後に続く数字の数をカウントし、文字列の数値化が終了した時点でカウントした数だけ変数valueを10で除算する、という方法を用いている。このとき、文字列を数値化する方法は十進整数定数を具現化する方法と同じ方法を用いている。また、十進整数定数化後に読み取った文字列が小数点でなければ、この処理は行われない。
これらの動作により、関数figure()は文字列の数値化を具現化している。

3.3 意味解析
次に、関数character()について解説する。関数characterには、大きく分けて二つの処理方法があり、条件文によって必ずどちらか一方の処理が実行される。一つ目の処理方法は、英字列の後に’=’が付属した場合に実行される。この処理では、英字列を一つの変数として扱い、’=’後に続く式の結果をその変数に格納している。その後、その変数をハッシュ法で撒布させ、該当するハッシュ表に関数registerKeyword()を用いて登録する。このとき、撒布後の衝突を防ぐため、ハッシュ表の各要素はリスト構造を持たせている。これにより、ハッシュ法による衝突が生じた場合でもリストを用いて変数を登録することが出来る。
もう一つの処理方法は、英字列の後に’=’が付属していない場合の処理方法である。この処理では、読み取った英字列を既にハッシュ表に登録されている変数として扱う。よって、まずはその英字列を関数FindKeyWord()を用いてハッシュ表から検索する。検索に失敗した場合は関数error()を呼び出し、成功した場合は関数popKeyValue()を用いてその英字列の持っている数値を取り出す。
これらの動作により、関数character()は、名前の同定と変数の登録を具現化している。

3.4 誤り処理
本項では、本プログラムの構文解析において、予期せぬ字句を読み取ってしまった場合の対処方法について解説する。
本プログラムでは、構文解析時に構文規則を満たさない字句を読み取ってしまった場合、関数error()を呼び出すことにしている。関数error()は３つの引数をもっており、それぞれ以下のような特徴を持っている。

(引数名：役割)
Char message[]：コンソール画面にmessageに格納されている文字列を出力する。
Int procedure：変数procedureより優先度の低い字句を読み飛ばす。
Int position：構文規則を満たしていない文字の場所を変数positionで示す。

上記に示したように各引数はそれぞれ、誤りを是正するために必要な情報を出来るだけ詳細に示す役割を持っている。以下に、各引数が関数error()のどのような振る舞いと関係があるのかを具体的に解説する。
配列messageでは、エラーの内容を出来るだけ詳細に教えてくれる役割を持っている。例えば、左括弧’(‘を事前に入力文字列から読み取っていたが、それらに対応する’)’が入力文字列に含まれていなかった場合は、引数の配列messageに”missing ‘(‘”を入れることで、画面にエラーの原因が出力することが出来る。

変数procedureは、入力文字列に誤りが複数あったとき、その誤りを出来る限り多く検出する役割を持っている。3.1の字句解析で、分類されるカテゴリはEnum宣言で列挙されることを示した。これを利用して、Enum宣言で列挙するカテゴリの順番を、出現する頻度や演算子同士の優先度を考慮して配置することで、次に来るべき字句を特定することが出来る。これにより、’+’や’*’などの演算子を使うべき部分で何らかの誤りがあった場合でも、Valueや括弧が出てくるまで入力文字列を読み飛ばして再び構文解析を開始することが出来る。つまり、変数procedureで適宜読み飛ばす字句を指定してあげることで、複数のerrorを一度に検出することが出来ることになる。
変数positionは、入力文字列のどこの文字がerrorの原因になったかを特定する役割を持っている。文字の場所を特定する方法として、入力文字列ポインタ用の変数cpを使っている。なぜなら、関数error()は構文解析中に呼び出され、かつ変数cpは常に次の文字を指しているので、chars[cp-1]の場所にある文字がerrorの発生した文字と一致するからである。しかしながら、3.1字句解析の解説で示したように、英字を読み取った場合は例外的な処理を行っているため、errorの発生する場所が常にcp-1になるとは限らない。よって、この場合に限っては、現在cpの指している位置によって変数positionの値を変更し、変数positionの値が常に英字列の最後尾を指すように設計した。
この関数error()を用いて、本プログラムでは誤りの原因の特定を行っている。

3.5 main関数
最後に、本プログラムのmain関数について解説する。main関数では、まず入力文字列を格納する配列charsの初期化を行い、その後、関数get_line()を呼び出している。3.1の字句解析で解説したように、関数get_line()は空行が入力されたかどうかの結果を返すことにしている。これを利用して、while文の条件式を”while(get_line() != FALSE)”とすることで、空行が入力されたときに終了するプログラムにすることできる。
次に、while文の内側で実行される処理を解説する。関数get_line()で何らかの入力文字列を読み取ったら、関数get_token()を呼び出して字句解析を一度だけ行う。これは一字先読みのルールを遵守するためである。その後、構文解析プログラムを用いて入力文字列の計算を行い、その結果を変数resultに格納する。その後、構文解析中にerrorが発生していなければresultの内容を出力し、errorが1個以上発生していればerror発生したことを画面に出力している。しかしながら、誤り処理の関数error()も完璧に隙が無いわけではない。このため、誤りを検知出来なかった場合の例外も想定して、誤り検出に失敗したときは”unexpected error”を画面に出力するようにしている。その後、配列charsを再度初期化して、再びwhile文の先頭に戻る。
While文を抜けたら、ハッシュ法を使用する際に確保したメモリを解放し、プログラムを終了させる。

4. 実行例
yohei@YOHEI-note ~/wl/calc
$ gcc calc.c -o calc
yohei@YOHEI-note ~/wl/calc
$ ./calc

----- 小数点、各演算子、括弧、代入、終了の確認 -----
# +2-1
>> 1.000000

# -2-1
>> -3.000000

# 2+5*10
>> 52.000000

# 2*5+10
>> 20.000000

# 2*(5+10)
>> 30.000000

# 3/2
>> 1.500000

# 5.5/2.2
>> 2.500000

# [5.5/2.2]
>> 2.000000

# [5/(1+1)]
>> 2.000000

# 2/1+6%3
>> 2.000000

# 4.9%0.3
>> 0.100000

# two=2
two holds 2.000000
>> 2.000000

# five=5
five holds 5.000000
>> 5.000000

# ten=10
ten holds 10.000000
>> 10.000000

# hundred=100
hundred holds 100.000000
>> 100.000000

# -(ten+five/two)*two*hundred
>> -2500.000000

#
shut down this program...
yohei@YOHEI-note ~/wl/calc
$

----- 誤り処理 -----
# 1+ 2++ 4
error: incorrect position of symbol.
1+ 2++ 4
^
1 error was found.

# (1+3*4
error: missing ')'
(1+3*4
^
1 error was found.

# 1++[2*3**1
error: incorrect position of symbol.
1++[2*3**1
^
error: incorrect position of symbol.
1++[2*3**1
^
error: missing ']'
1++[2*3**1
^
3 errors were found.

# two+five*second+ten
error: not registered word is used.
two+five*second+ten
^
1 error was found.

# 1.3.4.5.6
unexpected error

5. 設計した電卓のソースプログラム
* see the following page.
https://github.com/yasulab/calculator-with-lp

2008-06-03

calculator with LP

I finally have seen the goal of this programming!, namely making a calculator program using knowledge of language processor I have studied.
However, this program has still a lot of errors, so I will brush up tomorrow.
This program is a calculator which can run along this specification below.

1. can use operands: +, -, *, /, %
2. can use Gauss's integer using branckets
3. can change priority of operation using parenthsis
4. can assign value to a valuable structured by alphabets

The specification is implemented by this syntax rule below.

expression = ['-'] term {('+'|'-') term }
term = primary {('*'|'/'|'%') primary }
primary = character | figure | '(' expression ')' | '[' expression ']'
character = ID | ID '=' expression
figure = {number} ['.' {number}]
number = 0|1|2|3|4|5|6|7|8|9
ID = a|b|...|y|z|A|B|...|Y|Z

Finally, I have implemented it so far, as this draft program below.
I feel I will be able to finish it including report!

* To see the source of "calculator.c", please visit the following page.
https://github.com/yasulab/calculator-with-lp

2008-05-25

Lex Performance

This table above is comparision between token analyzers at file size of source code and object code. One is programmed by Lex, and the other is programmed by me. Both source code runs same way.

The table shows us that Lex has high performance of optimization, because the difference of file size between th token analyzers is only 500 bytes. How def lex is lol. I programmed the token analyzer in a few days, however lex producted them in just a moment! What a good job lex did.

/* A lexical scanner generated by flex */

/* Scanner skeleton version:
* $Header: /home/daffy/u0/vern/flex/RCS/flex.skl,v 2.91 96/09/10 16:58:48 vern Exp $
*/

#define FLEX_SCANNER
#define YY_FLEX_MAJOR_VERSION 2
#define YY_FLEX_MINOR_VERSION 5

#include <stdio.h>
#include <errno.h>

/* cfront 1.2 defines "c_plusplus" instead of "__cplusplus" */
#ifdef c_plusplus
#ifndef __cplusplus
#define __cplusplus
#endif
#endif

#ifdef __cplusplus

#include <stdlib.h>
#ifndef _WIN32
#include <unistd.h>
#endif

/* Use prototypes in function declarations. */
#define YY_USE_PROTOS

/* The "const" storage-class-modifier is valid. */
#define YY_USE_CONST

#else /* ! __cplusplus */

#if __STDC__

#define YY_USE_PROTOS
#define YY_USE_CONST

#endif /* __STDC__ */
#endif /* ! __cplusplus */

#ifdef __TURBOC__
#pragma warn -rch
#pragma warn -use
#include <io.h>
#include <stdlib.h>
#define YY_USE_CONST
#define YY_USE_PROTOS
#endif

#ifdef YY_USE_CONST
#define yyconst const
#else
#define yyconst
#endif

#ifdef YY_USE_PROTOS
#define YY_PROTO(proto) proto
#else
#define YY_PROTO(proto) ()
#endif

/* Returned upon end-of-file. */
#define YY_NULL 0

/* Promotes a possibly negative, possibly signed char to an unsigned
* integer for use as an array index. If the signed char is negative,
* we want to instead treat it as an 8-bit unsigned char, hence the
* double cast.
*/
#define YY_SC_TO_UI(c) ((unsigned int) (unsigned char) c)

/* Enter a start condition. This macro really ought to take a parameter,
* but we do it the disgusting crufty way forced on us by the ()-less
* definition of BEGIN.
*/
#define BEGIN yy_start = 1 + 2 *

/* Translate the current start state into a value that can be later handed
* to BEGIN to return to the state. The YYSTATE alias is for lex
* compatibility.
*/
#define YY_START ((yy_start - 1) / 2)
#define YYSTATE YY_START

/* Action number for EOF rule of a given start state. */
#define YY_STATE_EOF(state) (YY_END_OF_BUFFER + state + 1)

/* Special action meaning "start processing a new file". */
#define YY_NEW_FILE yyrestart( yyin )

#define YY_END_OF_BUFFER_CHAR 0

/* Size of default input buffer. */
#define YY_BUF_SIZE 16384

typedef struct yy_buffer_state *YY_BUFFER_STATE;

extern int yyleng;
extern FILE *yyin, *yyout;

#define EOB_ACT_CONTINUE_SCAN 0
#define EOB_ACT_END_OF_FILE 1
#define EOB_ACT_LAST_MATCH 2

/* The funky do-while in the following #define is used to turn the definition
* int a single C statement (which needs a semi-colon terminator). This
* avoids problems with code like:
*
* if ( condition_holds )
* yyless( 5 );
* else
* do_something_else();
*
* Prior to using the do-while the compiler would get upset at the
* "else" because it interpreted the "if" statement as being all
* done when it reached the ';' after the yyless() call.
*/

/* Return all but the first 'n' matched characters back to the input stream. */

#define yyless(n) \
do \
{ \
/* Undo effects of setting up yytext. */ \
*yy_cp = yy_hold_char; \
YY_RESTORE_YY_MORE_OFFSET \
yy_c_buf_p = yy_cp = yy_bp + n - YY_MORE_ADJ; \
YY_DO_BEFORE_ACTION; /* set up yytext again */ \
} \
while ( 0 )

#define unput(c) yyunput( c, yytext_ptr )

/* The following is because we cannot portably get our hands on size_t
* (without autoconf's help, which isn't available because we want
* flex-generated scanners to compile on their own).
*/
typedef unsigned int yy_size_t;

struct yy_buffer_state
{
FILE *yy_input_file;

char *yy_ch_buf; /* input buffer */
char *yy_buf_pos; /* current position in input buffer */

/* Size of input buffer in bytes, not including room for EOB
* characters.
*/
yy_size_t yy_buf_size;

/* Number of characters read into yy_ch_buf, not including EOB
* characters.
*/
int yy_n_chars;

/* Whether we "own" the buffer - i.e., we know we created it,
* and can realloc() it to grow it, and should free() it to
* delete it.
*/
int yy_is_our_buffer;

/* Whether this is an "interactive" input source; if so, and
* if we're using stdio for input, then we want to use getc()
* instead of fread(), to make sure we stop fetching input after
* each newline.
*/
int yy_is_interactive;

/* Whether we're considered to be at the beginning of a line.
* If so, '^' rules will be active on the next match, otherwise
* not.
*/
int yy_at_bol;

/* Whether to try to fill the input buffer when we reach the
* end of it.
*/
int yy_fill_buffer;

int yy_buffer_status;
#define YY_BUFFER_NEW 0
#define YY_BUFFER_NORMAL 1
/* When an EOF's been seen but there's still some text to process
* then we mark the buffer as YY_EOF_PENDING, to indicate that we
* shouldn't try reading from the input source any more. We might
* still have a bunch of tokens to match, though, because of
* possible backing-up.
*
* When we actually see the EOF, we change the status to "new"
* (via yyrestart()), so that the user can continue scanning by
* just pointing yyin at a new input file.
*/
#define YY_BUFFER_EOF_PENDING 2
};

static YY_BUFFER_STATE yy_current_buffer = 0;

/* We provide macros for accessing buffer states in case in the
* future we want to put the buffer states in a more general
* "scanner state".
*/
#define YY_CURRENT_BUFFER yy_current_buffer

/* yy_hold_char holds the character lost when yytext is formed. */
static char yy_hold_char;

static int yy_n_chars; /* number of characters read into yy_ch_buf */

int yyleng;

/* Points to current character in buffer. */
static char *yy_c_buf_p = (char *) 0;
static int yy_init = 1; /* whether we need to initialize */
static int yy_start = 0; /* start state number */

/* Flag which is used to allow yywrap()'s to do buffer switches
* instead of setting up a fresh yyin. A bit of a hack ...
*/
static int yy_did_buffer_switch_on_eof;

void yyrestart YY_PROTO(( FILE *input_file ));

void yy_switch_to_buffer YY_PROTO(( YY_BUFFER_STATE new_buffer ));
void yy_load_buffer_state YY_PROTO(( void ));
YY_BUFFER_STATE yy_create_buffer YY_PROTO(( FILE *file, int size ));
void yy_delete_buffer YY_PROTO(( YY_BUFFER_STATE b ));
void yy_init_buffer YY_PROTO(( YY_BUFFER_STATE b, FILE *file ));
void yy_flush_buffer YY_PROTO(( YY_BUFFER_STATE b ));
#define YY_FLUSH_BUFFER yy_flush_buffer( yy_current_buffer )

YY_BUFFER_STATE yy_scan_buffer YY_PROTO(( char *base, yy_size_t size ));
YY_BUFFER_STATE yy_scan_string YY_PROTO(( yyconst char *yy_str ));
YY_BUFFER_STATE yy_scan_bytes YY_PROTO(( yyconst char *bytes, int len ));

static void *yy_flex_alloc YY_PROTO(( yy_size_t ));
static void *yy_flex_realloc YY_PROTO(( void *, yy_size_t ));
static void yy_flex_free YY_PROTO(( void * ));

#define yy_new_buffer yy_create_buffer

#define yy_set_interactive(is_interactive) \
{ \
if ( ! yy_current_buffer ) \
yy_current_buffer = yy_create_buffer( yyin, YY_BUF_SIZE ); \
yy_current_buffer->yy_is_interactive = is_interactive; \
}

#define yy_set_bol(at_bol) \
{ \
if ( ! yy_current_buffer ) \
yy_current_buffer = yy_create_buffer( yyin, YY_BUF_SIZE ); \
yy_current_buffer->yy_at_bol = at_bol; \
}

#define YY_AT_BOL() (yy_current_buffer->yy_at_bol)

typedef unsigned char YY_CHAR;
FILE *yyin = (FILE *) 0, *yyout = (FILE *) 0;
typedef int yy_state_type;
extern char *yytext;
#define yytext_ptr yytext

static yy_state_type yy_get_previous_state YY_PROTO(( void ));
static yy_state_type yy_try_NUL_trans YY_PROTO(( yy_state_type current_state ));
static int yy_get_next_buffer YY_PROTO(( void ));
static void yy_fatal_error YY_PROTO(( yyconst char msg[] ));

/* Done after the current pattern has been matched and before the
* corresponding action - sets up yytext.
*/
#define YY_DO_BEFORE_ACTION \
yytext_ptr = yy_bp; \
yyleng = (int) (yy_cp - yy_bp); \
yy_hold_char = *yy_cp; \
*yy_cp = '\0'; \
yy_c_buf_p = yy_cp;

#define YY_NUM_RULES 18
#define YY_END_OF_BUFFER 19
static yyconst short int yy_accept[41] =
{ 0,
0, 0, 19, 17, 16, 8, 17, 10, 8, 17,
15, 9, 10, 5, 4, 11, 14, 11, 7, 17,
12, 0, 2, 0, 13, 0, 0, 0, 5, 0,
4, 11, 7, 3, 0, 0, 0, 6, 1, 0
} ;

static yyconst int yy_ec[256] =
{ 0,
1, 1, 1, 1, 1, 1, 1, 1, 1, 2,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 3, 4, 1, 1, 5, 6, 7, 8,
9, 10, 11, 12, 13, 1, 14, 15, 16, 16,
16, 16, 16, 16, 16, 17, 17, 1, 18, 19,
20, 21, 1, 1, 22, 22, 22, 22, 22, 22,
23, 23, 23, 23, 23, 23, 23, 23, 23, 23,
23, 23, 23, 23, 23, 23, 23, 24, 23, 23,
25, 26, 27, 1, 28, 1, 29, 29, 29, 29,

29, 29, 23, 23, 23, 23, 23, 23, 23, 23,
23, 23, 23, 23, 23, 23, 23, 23, 23, 30,
23, 23, 31, 32, 33, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1,

1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1
} ;

static yyconst int yy_meta[34] =
{ 0,
1, 2, 1, 1, 1, 1, 3, 1, 1, 1,
1, 1, 1, 1, 4, 4, 4, 1, 1, 1,
1, 4, 5, 5, 1, 1, 1, 5, 4, 5,
1, 1, 1
} ;

static yyconst short int yy_base[46] =
{ 0,
0, 0, 85, 86, 86, 64, 30, 86, 77, 56,
86, 86, 71, 20, 22, 60, 50, 49, 0, 32,
86, 36, 86, 58, 86, 38, 39, 33, 33, 0,
36, 86, 0, 86, 44, 45, 47, 0, 86, 86,
62, 67, 69, 74, 38
} ;

static yyconst short int yy_def[46] =
{ 0,
40, 1, 40, 40, 40, 40, 41, 40, 40, 42,
40, 40, 40, 40, 40, 40, 40, 40, 43, 40,
40, 41, 40, 41, 40, 40, 42, 44, 40, 45,
40, 40, 43, 40, 44, 44, 44, 45, 40, 0,
40, 40, 40, 40, 40
} ;

static yyconst short int yy_nxt[120] =
{ 0,
4, 5, 6, 7, 8, 9, 10, 11, 11, 8,
12, 11, 12, 13, 14, 15, 15, 11, 16, 17,
18, 19, 19, 19, 11, 4, 11, 19, 19, 19,
11, 20, 11, 23, 29, 29, 31, 31, 31, 23,
26, 38, 36, 30, 34, 26, 37, 29, 29, 30,
31, 31, 31, 36, 36, 24, 40, 37, 39, 22,
40, 24, 22, 25, 22, 22, 22, 26, 32, 21,
26, 26, 33, 33, 35, 35, 35, 35, 35, 32,
28, 27, 25, 21, 40, 3, 40, 40, 40, 40,
40, 40, 40, 40, 40, 40, 40, 40, 40, 40,

40, 40, 40, 40, 40, 40, 40, 40, 40, 40,
40, 40, 40, 40, 40, 40, 40, 40, 40
} ;

static yyconst short int yy_chk[120] =
{ 0,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 7, 14, 14, 15, 15, 15, 22,
27, 45, 28, 14, 26, 27, 28, 29, 29, 14,
31, 31, 31, 35, 36, 7, 37, 35, 36, 24,
37, 22, 41, 20, 41, 41, 41, 42, 18, 17,
42, 42, 43, 43, 44, 44, 44, 44, 44, 16,
13, 10, 9, 6, 3, 40, 40, 40, 40, 40,
40, 40, 40, 40, 40, 40, 40, 40, 40, 40,

40, 40, 40, 40, 40, 40, 40, 40, 40, 40,
40, 40, 40, 40, 40, 40, 40, 40, 40
} ;

static yy_state_type yy_last_accepting_state;
static char *yy_last_accepting_cpos;

/* The intent behind this definition is that it'll catch
* any uses of REJECT which flex missed.
*/
#define REJECT reject_used_but_not_detected
#define yymore() yymore_used_but_not_detected
#define YY_MORE_ADJ 0
#define YY_RESTORE_YY_MORE_OFFSET
char *yytext;
#line 1 "flex2.lex"
#define INITIAL 0
/* scanner for a toy wl-like language */
#line 4 "flex2.lex"
/* need this for the call to atoi() below */
#include<stdlib.h>
#include<string.h>
#include <stdio.h>

#define HASHSIZE 40 /* size of hash table */
#define MAX_KW_LEN 11 /* max length of keyword */
#define NUM_KW 6 /* number of keyword */

#define TRUE 1
#define FALSE 0

struct list {
char keyword[MAX_KW_LEN];
struct list *next; /* next list pointer */
};

struct list *hashtable[HASHSIZE]; /* hash table */

int Hash(char *key); /* return hash value of HASHSIZE from zero */
void InitHTable(void); /* register keyword in hash table */
int FindKeyWord(char *key); /* try to find out keyword in hashtable */
void FreeKeyWord(void); /* flee alocated memory */

#line 430 "lex.yy.c"

/* Macros after this point can all be overridden by user definitions in
* section 1.
*/

#ifndef YY_SKIP_YYWRAP
#ifdef __cplusplus
extern "C" int yywrap YY_PROTO(( void ));
#else
extern int yywrap YY_PROTO(( void ));
#endif
#endif

#ifndef YY_NO_UNPUT
static void yyunput YY_PROTO(( int c, char *buf_ptr ));
#endif

#ifndef yytext_ptr
static void yy_flex_strncpy YY_PROTO(( char *, yyconst char *, int ));
#endif

#ifdef YY_NEED_STRLEN
static int yy_flex_strlen YY_PROTO(( yyconst char * ));
#endif

#ifndef YY_NO_INPUT
#ifdef __cplusplus
static int yyinput YY_PROTO(( void ));
#else
static int input YY_PROTO(( void ));
#endif
#endif

#if YY_STACK_USED
static int yy_start_stack_ptr = 0;
static int yy_start_stack_depth = 0;
static int *yy_start_stack = 0;
#ifndef YY_NO_PUSH_STATE
static void yy_push_state YY_PROTO(( int new_state ));
#endif
#ifndef YY_NO_POP_STATE
static void yy_pop_state YY_PROTO(( void ));
#endif
#ifndef YY_NO_TOP_STATE
static int yy_top_state YY_PROTO(( void ));
#endif

#else
#define YY_NO_PUSH_STATE 1
#define YY_NO_POP_STATE 1
#define YY_NO_TOP_STATE 1
#endif

#ifdef YY_MALLOC_DECL
YY_MALLOC_DECL
#else
#if __STDC__
#ifndef __cplusplus
#include <stdlib.h>
#endif
#else
/* Just try to get by without declaring the routines. This will fail
* miserably on non-ANSI systems for which sizeof(size_t) != sizeof(int)
* or sizeof(void*) != sizeof(int).
*/
#endif
#endif

/* Amount of stuff to slurp up with each read. */
#ifndef YY_READ_BUF_SIZE
#define YY_READ_BUF_SIZE 8192
#endif

/* Copy whatever the last rule matched to the standard output. */

#ifndef ECHO
/* This used to be an fputs(), but since the string might contain NUL's,
* we now use fwrite().
*/
#define ECHO (void) fwrite( yytext, yyleng, 1, yyout )
#endif

/* Gets input and stuffs it into "buf". number of characters read, or YY_NULL,
* is returned in "result".
*/
#ifndef YY_INPUT
#define YY_INPUT(buf,result,max_size) \
if ( yy_current_buffer->yy_is_interactive ) \
{ \
int c = '*', n; \
for ( n = 0; n < max_size && \
(c = getc( yyin )) != EOF && c != '\n'; ++n ) \
buf[n] = (char) c; \
if ( c == '\n' ) \
buf[n++] = (char) c; \
if ( c == EOF && ferror( yyin ) ) \
YY_FATAL_ERROR( "input in flex scanner failed" ); \
result = n; \
} \
else \
{ \
errno=0; \
while ( (result = fread(buf, 1, max_size, yyin))==0 && ferror(yyin)) \
{ \
if( errno != EINTR) \
{ \
YY_FATAL_ERROR( "input in flex scanner failed" ); \
break; \
} \
errno=0; \
clearerr(yyin); \
} \
}
#endif

/* No semi-colon after return; correct usage is to write "yyterminate();" -
* we don't want an extra ';' after the "return" because that will cause
* some compilers to complain about unreachable statements.
*/
#ifndef yyterminate
#define yyterminate() return YY_NULL
#endif

/* Number of entries by which start-condition stack grows. */
#ifndef YY_START_STACK_INCR
#define YY_START_STACK_INCR 25
#endif

/* Report a fatal error. */
#ifndef YY_FATAL_ERROR
#define YY_FATAL_ERROR(msg) yy_fatal_error( msg )
#endif

/* Default declaration of generated scanner - a define so the user can
* easily add parameters.
*/
#ifndef YY_DECL
#define YY_DECL int yylex YY_PROTO(( void ))
#endif

/* Code executed at the beginning of each rule, after yytext and yyleng
* have been set up.
*/
#ifndef YY_USER_ACTION
#define YY_USER_ACTION
#endif

/* Code executed at the end of each rule. */
#ifndef YY_BREAK
#define YY_BREAK break;
#endif

#define YY_RULE_SETUP \
YY_USER_ACTION

YY_DECL
{
register yy_state_type yy_current_state;
register char *yy_cp, *yy_bp;
register int yy_act;

#line 49 "flex2.lex"

#line 595 "lex.yy.c"

if ( yy_init )
{
yy_init = 0;

#ifdef YY_USER_INIT
YY_USER_INIT;
#endif

if ( ! yy_start )
yy_start = 1; /* first start state */

if ( ! yyin )
yyin = stdin;

if ( ! yyout )
yyout = stdout;

if ( ! yy_current_buffer )
yy_current_buffer =
yy_create_buffer( yyin, YY_BUF_SIZE );

yy_load_buffer_state();
}

while ( 1 ) /* loops until end-of-file is reached */
{
yy_cp = yy_c_buf_p;

/* Support of yytext. */
*yy_cp = yy_hold_char;

/* yy_bp points to the position in yy_ch_buf of the start of
* the current run.
*/
yy_bp = yy_cp;

yy_current_state = yy_start;
yy_match:
do
{
register YY_CHAR yy_c = yy_ec[YY_SC_TO_UI(*yy_cp)];
if ( yy_accept[yy_current_state] )
{
yy_last_accepting_state = yy_current_state;
yy_last_accepting_cpos = yy_cp;
}
while ( yy_chk[yy_base[yy_current_state] + yy_c] != yy_current_state )
{
yy_current_state = (int) yy_def[yy_current_state];
if ( yy_current_state >= 41 )
yy_c = yy_meta[(unsigned int) yy_c];
}
yy_current_state = yy_nxt[yy_base[yy_current_state] + (unsigned int) yy_c];
++yy_cp;
}
while ( yy_base[yy_current_state] != 86 );

yy_find_action:
yy_act = yy_accept[yy_current_state];
if ( yy_act == 0 )
{ /* have to back up */
yy_cp = yy_last_accepting_cpos;
yy_current_state = yy_last_accepting_state;
yy_act = yy_accept[yy_current_state];
}

YY_DO_BEFORE_ACTION;

do_action: /* This label is used only to access EOF actions. */

switch ( yy_act )
{ /* beginning of action switch */
case 0: /* must back up */
/* undo the effects of YY_DO_BEFORE_ACTION */
*yy_cp = yy_hold_char;
yy_cp = yy_last_accepting_cpos;
yy_current_state = yy_last_accepting_state;
goto yy_find_action;

case 1:
YY_RULE_SETUP
#line 51 "flex2.lex"
{
printf("comment\n");
}
YY_BREAK
case 2:
YY_RULE_SETUP
#line 55 "flex2.lex"
{
int i=0, len;
len = strlen(yytext);
yytext[--len] = '\0';
memmove(yytext, yytext+1,len--);
printf("string: ");
for(;i<len;i++){
if(*(yytext+i)=='\\'){
i++;
switch(*(yytext+i)){
case 'b': printf("\b"); break;
case 't': printf("\t"); break;
case 'n': printf("\n"); break;
case 'v': printf("\v"); break;
case 'f': printf("\f"); break;
case 'r': printf("\r"); break;
}
}else putc(*(yytext+i), stdout);
}
printf("\n");
}
YY_BREAK
case 3:
YY_RULE_SETUP
#line 77 "flex2.lex"
{
int i=0, len;
len = strlen(yytext);
yytext[--len] = '\0';
memmove(yytext, yytext+1,len--);
printf("character: ");
if(*(yytext+i)=='\\'){
i++;
switch(*(yytext+i)){
case 'b': printf("8\n"); break;
case 't': printf("9\n"); break;
case 'n': printf("10\n"); break;
case 'v': printf("11\n"); break;
case 'f': printf("12\n"); break;
case 'r': printf("13\n"); break;
case '0': printf("0\n"); break;
default: printf("%d\n",yytext[i]);
}
}else{
printf("%d\n",yytext[0]);
}
}
YY_BREAK
case 4:
YY_RULE_SETUP
#line 100 "flex2.lex"
{
printf("integer: %d\n", atoi(yytext));
}
YY_BREAK
case 5:
YY_RULE_SETUP
#line 105 "flex2.lex"
{
printf("integer: %d\n", strtol(yytext,NULL,8));
}
YY_BREAK
case 6:
YY_RULE_SETUP
#line 110 "flex2.lex"
{
printf("integer: %d\n", strtol(yytext,NULL,16));
}
YY_BREAK
case 7:
YY_RULE_SETUP
#line 114 "flex2.lex"
{
if((FindKeyWord(yytext)) == TRUE){
printf("%s\n", yytext);
}else{
printf("Name: %s\n",yytext);
}
}
YY_BREAK
case 8:
YY_RULE_SETUP
#line 122 "flex2.lex"
{
printf("%s\n",yytext);
}
YY_BREAK
case 9:
YY_RULE_SETUP
#line 126 "flex2.lex"
{
printf("%s\n",yytext);
}
YY_BREAK
case 10:
YY_RULE_SETUP
#line 131 "flex2.lex"
{
printf("%s\n",yytext);
}
YY_BREAK
case 11:
YY_RULE_SETUP
#line 136 "flex2.lex"
{
printf("%s\n",yytext);
}
YY_BREAK
case 12:
YY_RULE_SETUP
#line 140 "flex2.lex"
{
printf("%s\n",yytext);
}
YY_BREAK
case 13:
YY_RULE_SETUP
#line 145 "flex2.lex"
{
printf("%s\n",yytext);
}
YY_BREAK
case 14:
YY_RULE_SETUP
#line 149 "flex2.lex"
{
printf("%s\n",yytext);
}
YY_BREAK
case 15:
YY_RULE_SETUP
#line 153 "flex2.lex"
{
printf("%s\n",yytext);
}
YY_BREAK
case 16:
YY_RULE_SETUP
#line 159 "flex2.lex"
;/* ignore a new-line */
YY_BREAK
case 17:
YY_RULE_SETUP
#line 160 "flex2.lex"
;/* ignore them */
YY_BREAK
case 18:
YY_RULE_SETUP
#line 163 "flex2.lex"
ECHO;
YY_BREAK
#line 839 "lex.yy.c"
case YY_STATE_EOF(INITIAL):
yyterminate();

case YY_END_OF_BUFFER:
{
/* Amount of text matched not including the EOB char. */
int yy_amount_of_matched_text = (int) (yy_cp - yytext_ptr) - 1;

/* Undo the effects of YY_DO_BEFORE_ACTION. */
*yy_cp = yy_hold_char;
YY_RESTORE_YY_MORE_OFFSET

if ( yy_current_buffer->yy_buffer_status == YY_BUFFER_NEW )
{
/* We're scanning a new file or input source. It's
* possible that this happened because the user
* just pointed yyin at a new source and called
* yylex(). If so, then we have to assure
* consistency between yy_current_buffer and our
* globals. Here is the right place to do so, because
* this is the first action (other than possibly a
* back-up) that will match for the new input source.
*/
yy_n_chars = yy_current_buffer->yy_n_chars;
yy_current_buffer->yy_input_file = yyin;
yy_current_buffer->yy_buffer_status = YY_BUFFER_NORMAL;
}

/* Note that here we test for yy_c_buf_p "<=" to the position
* of the first EOB in the buffer, since yy_c_buf_p will
* already have been incremented past the NUL character
* (since all states make transitions on EOB to the
* end-of-buffer state). Contrast this with the test
* in input().
*/
if ( yy_c_buf_p <= &yy_current_buffer->yy_ch_buf[yy_n_chars] )
{ /* This was really a NUL. */
yy_state_type yy_next_state;

yy_c_buf_p = yytext_ptr + yy_amount_of_matched_text;

yy_current_state = yy_get_previous_state();

/* Okay, we're now positioned to make the NUL
* transition. We couldn't have
* yy_get_previous_state() go ahead and do it
* for us because it doesn't know how to deal
* with the possibility of jamming (and we don't
* want to build jamming into it because then it
* will run more slowly).
*/

yy_next_state = yy_try_NUL_trans( yy_current_state );

yy_bp = yytext_ptr + YY_MORE_ADJ;

if ( yy_next_state )
{
/* Consume the NUL. */
yy_cp = ++yy_c_buf_p;
yy_current_state = yy_next_state;
goto yy_match;
}

else
{
yy_cp = yy_c_buf_p;
goto yy_find_action;
}
}

else switch ( yy_get_next_buffer() )
{
case EOB_ACT_END_OF_FILE:
{
yy_did_buffer_switch_on_eof = 0;

if ( yywrap() )
{
/* Note: because we've taken care in
* yy_get_next_buffer() to have set up
* yytext, we can now set up
* yy_c_buf_p so that if some total
* hoser (like flex itself) wants to
* call the scanner after we return the
* YY_NULL, it'll still work - another
* YY_NULL will get returned.
*/
yy_c_buf_p = yytext_ptr + YY_MORE_ADJ;

yy_act = YY_STATE_EOF(YY_START);
goto do_action;
}

else
{
if ( ! yy_did_buffer_switch_on_eof )
YY_NEW_FILE;
}
break;
}

case EOB_ACT_CONTINUE_SCAN:
yy_c_buf_p =
yytext_ptr + yy_amount_of_matched_text;

yy_current_state = yy_get_previous_state();

yy_cp = yy_c_buf_p;
yy_bp = yytext_ptr + YY_MORE_ADJ;
goto yy_match;

case EOB_ACT_LAST_MATCH:
yy_c_buf_p =
&yy_current_buffer->yy_ch_buf[yy_n_chars];

yy_current_state = yy_get_previous_state();

yy_cp = yy_c_buf_p;
yy_bp = yytext_ptr + YY_MORE_ADJ;
goto yy_find_action;
}
break;
}

default:
YY_FATAL_ERROR(
"fatal flex scanner internal error--no action found" );
} /* end of action switch */
} /* end of scanning one token */
} /* end of yylex */

/* yy_get_next_buffer - try to read in a new buffer
*
* Returns a code representing an action:
* EOB_ACT_LAST_MATCH -
* EOB_ACT_CONTINUE_SCAN - continue scanning from current position
* EOB_ACT_END_OF_FILE - end of file
*/

static int yy_get_next_buffer()
{
register char *dest = yy_current_buffer->yy_ch_buf;
register char *source = yytext_ptr;
register int number_to_move, i;
int ret_val;

if ( yy_c_buf_p > &yy_current_buffer->yy_ch_buf[yy_n_chars + 1] )
YY_FATAL_ERROR(
"fatal flex scanner internal error--end of buffer missed" );

if ( yy_current_buffer->yy_fill_buffer == 0 )
{ /* Don't try to fill the buffer, so this is an EOF. */
if ( yy_c_buf_p - yytext_ptr - YY_MORE_ADJ == 1 )
{
/* We matched a single character, the EOB, so
* treat this as a final EOF.
*/
return EOB_ACT_END_OF_FILE;
}

else
{
/* We matched some text prior to the EOB, first
* process it.
*/
return EOB_ACT_LAST_MATCH;
}
}

/* Try to read more data. */

/* First move last chars to start of buffer. */
number_to_move = (int) (yy_c_buf_p - yytext_ptr) - 1;

for ( i = 0; i < number_to_move; ++i )
*(dest++) = *(source++);

if ( yy_current_buffer->yy_buffer_status == YY_BUFFER_EOF_PENDING )
/* don't do the read, it's not guaranteed to return an EOF,
* just force an EOF
*/
yy_current_buffer->yy_n_chars = yy_n_chars = 0;

else
{
int num_to_read =
yy_current_buffer->yy_buf_size - number_to_move - 1;

while ( num_to_read <= 0 )
{ /* Not enough room in the buffer - grow it. */
#ifdef YY_USES_REJECT
YY_FATAL_ERROR(
"input buffer overflow, can't enlarge buffer because scanner uses REJECT" );
#else

/* just a shorter name for the current buffer */
YY_BUFFER_STATE b = yy_current_buffer;

int yy_c_buf_p_offset =
(int) (yy_c_buf_p - b->yy_ch_buf);

if ( b->yy_is_our_buffer )
{
int new_size = b->yy_buf_size * 2;

if ( new_size <= 0 )
b->yy_buf_size += b->yy_buf_size / 8;
else
b->yy_buf_size *= 2;

b->yy_ch_buf = (char *)
/* Include room in for 2 EOB chars. */
yy_flex_realloc( (void *) b->yy_ch_buf,
b->yy_buf_size + 2 );
}
else
/* Can't grow it, we don't own it. */
b->yy_ch_buf = 0;

if ( ! b->yy_ch_buf )
YY_FATAL_ERROR(
"fatal error - scanner input buffer overflow" );

yy_c_buf_p = &b->yy_ch_buf[yy_c_buf_p_offset];

num_to_read = yy_current_buffer->yy_buf_size -
number_to_move - 1;
#endif
}

if ( num_to_read > YY_READ_BUF_SIZE )
num_to_read = YY_READ_BUF_SIZE;

/* Read in more data. */
YY_INPUT( (&yy_current_buffer->yy_ch_buf[number_to_move]),
yy_n_chars, num_to_read );

yy_current_buffer->yy_n_chars = yy_n_chars;
}

if ( yy_n_chars == 0 )
{
if ( number_to_move == YY_MORE_ADJ )
{
ret_val = EOB_ACT_END_OF_FILE;
yyrestart( yyin );
}

else
{
ret_val = EOB_ACT_LAST_MATCH;
yy_current_buffer->yy_buffer_status =
YY_BUFFER_EOF_PENDING;
}
}

else
ret_val = EOB_ACT_CONTINUE_SCAN;

yy_n_chars += number_to_move;
yy_current_buffer->yy_ch_buf[yy_n_chars] = YY_END_OF_BUFFER_CHAR;
yy_current_buffer->yy_ch_buf[yy_n_chars + 1] = YY_END_OF_BUFFER_CHAR;

yytext_ptr = &yy_current_buffer->yy_ch_buf[0];

return ret_val;
}

/* yy_get_previous_state - get the state just before the EOB char was reached */

static yy_state_type yy_get_previous_state()
{
register yy_state_type yy_current_state;
register char *yy_cp;

yy_current_state = yy_start;

for ( yy_cp = yytext_ptr + YY_MORE_ADJ; yy_cp < yy_c_buf_p; ++yy_cp )
{
register YY_CHAR yy_c = (*yy_cp ? yy_ec[YY_SC_TO_UI(*yy_cp)] : 1);
if ( yy_accept[yy_current_state] )
{
yy_last_accepting_state = yy_current_state;
yy_last_accepting_cpos = yy_cp;
}
while ( yy_chk[yy_base[yy_current_state] + yy_c] != yy_current_state )
{
yy_current_state = (int) yy_def[yy_current_state];
if ( yy_current_state >= 41 )
yy_c = yy_meta[(unsigned int) yy_c];
}
yy_current_state = yy_nxt[yy_base[yy_current_state] + (unsigned int) yy_c];
}

return yy_current_state;
}

/* yy_try_NUL_trans - try to make a transition on the NUL character
*
* synopsis
* next_state = yy_try_NUL_trans( current_state );
*/

#ifdef YY_USE_PROTOS
static yy_state_type yy_try_NUL_trans( yy_state_type yy_current_state )
#else
static yy_state_type yy_try_NUL_trans( yy_current_state )
yy_state_type yy_current_state;
#endif
{
register int yy_is_jam;
register char *yy_cp = yy_c_buf_p;

register YY_CHAR yy_c = 1;
if ( yy_accept[yy_current_state] )
{
yy_last_accepting_state = yy_current_state;
yy_last_accepting_cpos = yy_cp;
}
while ( yy_chk[yy_base[yy_current_state] + yy_c] != yy_current_state )
{
yy_current_state = (int) yy_def[yy_current_state];
if ( yy_current_state >= 41 )
yy_c = yy_meta[(unsigned int) yy_c];
}
yy_current_state = yy_nxt[yy_base[yy_current_state] + (unsigned int) yy_c];
yy_is_jam = (yy_current_state == 40);

return yy_is_jam ? 0 : yy_current_state;
}

#ifndef YY_NO_UNPUT
#ifdef YY_USE_PROTOS
static void yyunput( int c, register char *yy_bp )
#else
static void yyunput( c, yy_bp )
int c;
register char *yy_bp;
#endif
{
register char *yy_cp = yy_c_buf_p;

/* undo effects of setting up yytext */
*yy_cp = yy_hold_char;

if ( yy_cp < yy_current_buffer->yy_ch_buf + 2 )
{ /* need to shift things up to make room */
/* +2 for EOB chars. */
register int number_to_move = yy_n_chars + 2;
register char *dest = &yy_current_buffer->yy_ch_buf[
yy_current_buffer->yy_buf_size + 2];
register char *source =
&yy_current_buffer->yy_ch_buf[number_to_move];

while ( source > yy_current_buffer->yy_ch_buf )
*--dest = *--source;

yy_cp += (int) (dest - source);
yy_bp += (int) (dest - source);
yy_current_buffer->yy_n_chars =
yy_n_chars = yy_current_buffer->yy_buf_size;

if ( yy_cp < yy_current_buffer->yy_ch_buf + 2 )
YY_FATAL_ERROR( "flex scanner push-back overflow" );
}

*--yy_cp = (char) c;

yytext_ptr = yy_bp;
yy_hold_char = *yy_cp;
yy_c_buf_p = yy_cp;
}
#endif /* ifndef YY_NO_UNPUT */

#ifdef __cplusplus
static int yyinput()
#else
static int input()
#endif
{
int c;

*yy_c_buf_p = yy_hold_char;

if ( *yy_c_buf_p == YY_END_OF_BUFFER_CHAR )
{
/* yy_c_buf_p now points to the character we want to return.
* If this occurs *before* the EOB characters, then it's a
* valid NUL; if not, then we've hit the end of the buffer.
*/
if ( yy_c_buf_p < &yy_current_buffer->yy_ch_buf[yy_n_chars] )
/* This was really a NUL. */
*yy_c_buf_p = '\0';

else
{ /* need more input */
int offset = yy_c_buf_p - yytext_ptr;
++yy_c_buf_p;

switch ( yy_get_next_buffer() )
{
case EOB_ACT_LAST_MATCH:
/* This happens because yy_g_n_b()
* sees that we've accumulated a
* token and flags that we need to
* try matching the token before
* proceeding. But for input(),
* there's no matching to consider.
* So convert the EOB_ACT_LAST_MATCH
* to EOB_ACT_END_OF_FILE.
*/

/* Reset buffer status. */
yyrestart( yyin );

/* fall through */

case EOB_ACT_END_OF_FILE:
{
if ( yywrap() )
return EOF;

if ( ! yy_did_buffer_switch_on_eof )
YY_NEW_FILE;
#ifdef __cplusplus
return yyinput();
#else
return input();
#endif
}

case EOB_ACT_CONTINUE_SCAN:
yy_c_buf_p = yytext_ptr + offset;
break;
}
}
}

c = *(unsigned char *) yy_c_buf_p; /* cast for 8-bit char's */
*yy_c_buf_p = '\0'; /* preserve yytext */
yy_hold_char = *++yy_c_buf_p;

return c;
}

#ifdef YY_USE_PROTOS
void yyrestart( FILE *input_file )
#else
void yyrestart( input_file )
FILE *input_file;
#endif
{
if ( ! yy_current_buffer )
yy_current_buffer = yy_create_buffer( yyin, YY_BUF_SIZE );

yy_init_buffer( yy_current_buffer, input_file );
yy_load_buffer_state();
}

#ifdef YY_USE_PROTOS
void yy_switch_to_buffer( YY_BUFFER_STATE new_buffer )
#else
void yy_switch_to_buffer( new_buffer )
YY_BUFFER_STATE new_buffer;
#endif
{
if ( yy_current_buffer == new_buffer )
return;

if ( yy_current_buffer )
{
/* Flush out information for old buffer. */
*yy_c_buf_p = yy_hold_char;
yy_current_buffer->yy_buf_pos = yy_c_buf_p;
yy_current_buffer->yy_n_chars = yy_n_chars;
}

yy_current_buffer = new_buffer;
yy_load_buffer_state();

/* We don't actually know whether we did this switch during
* EOF (yywrap()) processing, but the only time this flag
* is looked at is after yywrap() is called, so it's safe
* to go ahead and always set it.
*/
yy_did_buffer_switch_on_eof = 1;
}

#ifdef YY_USE_PROTOS
void yy_load_buffer_state( void )
#else
void yy_load_buffer_state()
#endif
{
yy_n_chars = yy_current_buffer->yy_n_chars;
yytext_ptr = yy_c_buf_p = yy_current_buffer->yy_buf_pos;
yyin = yy_current_buffer->yy_input_file;
yy_hold_char = *yy_c_buf_p;
}

#ifdef YY_USE_PROTOS
YY_BUFFER_STATE yy_create_buffer( FILE *file, int size )
#else
YY_BUFFER_STATE yy_create_buffer( file, size )
FILE *file;
int size;
#endif
{
YY_BUFFER_STATE b;

b = (YY_BUFFER_STATE) yy_flex_alloc( sizeof( struct yy_buffer_state ) );
if ( ! b )
YY_FATAL_ERROR( "out of dynamic memory in yy_create_buffer()" );

b->yy_buf_size = size;

/* yy_ch_buf has to be 2 characters longer than the size given because
* we need to put in 2 end-of-buffer characters.
*/
b->yy_ch_buf = (char *) yy_flex_alloc( b->yy_buf_size + 2 );
if ( ! b->yy_ch_buf )
YY_FATAL_ERROR( "out of dynamic memory in yy_create_buffer()" );

b->yy_is_our_buffer = 1;

yy_init_buffer( b, file );

return b;
}

#ifdef YY_USE_PROTOS
void yy_delete_buffer( YY_BUFFER_STATE b )
#else
void yy_delete_buffer( b )
YY_BUFFER_STATE b;
#endif
{
if ( ! b )
return;

if ( b == yy_current_buffer )
yy_current_buffer = (YY_BUFFER_STATE) 0;

if ( b->yy_is_our_buffer )
yy_flex_free( (void *) b->yy_ch_buf );

yy_flex_free( (void *) b );
}

#ifndef _WIN32
#include <unistd.h>
#else
#ifndef YY_ALWAYS_INTERACTIVE
#ifndef YY_NEVER_INTERACTIVE
extern int isatty YY_PROTO(( int ));
#endif
#endif
#endif

#ifdef YY_USE_PROTOS
void yy_init_buffer( YY_BUFFER_STATE b, FILE *file )
#else
void yy_init_buffer( b, file )
YY_BUFFER_STATE b;
FILE *file;
#endif

{
yy_flush_buffer( b );

b->yy_input_file = file;
b->yy_fill_buffer = 1;

#if YY_ALWAYS_INTERACTIVE
b->yy_is_interactive = 1;
#else
#if YY_NEVER_INTERACTIVE
b->yy_is_interactive = 0;
#else
b->yy_is_interactive = file ? (isatty( fileno(file) ) > 0) : 0;
#endif
#endif
}

#ifdef YY_USE_PROTOS
void yy_flush_buffer( YY_BUFFER_STATE b )
#else
void yy_flush_buffer( b )
YY_BUFFER_STATE b;
#endif

{
if ( ! b )
return;

b->yy_n_chars = 0;

/* We always need two end-of-buffer characters. The first causes
* a transition to the end-of-buffer state. The second causes
* a jam in that state.
*/
b->yy_ch_buf[0] = YY_END_OF_BUFFER_CHAR;
b->yy_ch_buf[1] = YY_END_OF_BUFFER_CHAR;

b->yy_buf_pos = &b->yy_ch_buf[0];

b->yy_at_bol = 1;
b->yy_buffer_status = YY_BUFFER_NEW;

if ( b == yy_current_buffer )
yy_load_buffer_state();
}

#ifndef YY_NO_SCAN_BUFFER
#ifdef YY_USE_PROTOS
YY_BUFFER_STATE yy_scan_buffer( char *base, yy_size_t size )
#else
YY_BUFFER_STATE yy_scan_buffer( base, size )
char *base;
yy_size_t size;
#endif
{
YY_BUFFER_STATE b;

if ( size < 2 ||
base[size-2] != YY_END_OF_BUFFER_CHAR ||
base[size-1] != YY_END_OF_BUFFER_CHAR )
/* They forgot to leave room for the EOB's. */
return 0;

b = (YY_BUFFER_STATE) yy_flex_alloc( sizeof( struct yy_buffer_state ) );
if ( ! b )
YY_FATAL_ERROR( "out of dynamic memory in yy_scan_buffer()" );

b->yy_buf_size = size - 2; /* "- 2" to take care of EOB's */
b->yy_buf_pos = b->yy_ch_buf = base;
b->yy_is_our_buffer = 0;
b->yy_input_file = 0;
b->yy_n_chars = b->yy_buf_size;
b->yy_is_interactive = 0;
b->yy_at_bol = 1;
b->yy_fill_buffer = 0;
b->yy_buffer_status = YY_BUFFER_NEW;

yy_switch_to_buffer( b );

return b;
}
#endif

#ifndef YY_NO_SCAN_STRING
#ifdef YY_USE_PROTOS
YY_BUFFER_STATE yy_scan_string( yyconst char *yy_str )
#else
YY_BUFFER_STATE yy_scan_string( yy_str )
yyconst char *yy_str;
#endif
{
int len;
for ( len = 0; yy_str[len]; ++len )
;

return yy_scan_bytes( yy_str, len );
}
#endif

#ifndef YY_NO_SCAN_BYTES
#ifdef YY_USE_PROTOS
YY_BUFFER_STATE yy_scan_bytes( yyconst char *bytes, int len )
#else
YY_BUFFER_STATE yy_scan_bytes( bytes, len )
yyconst char *bytes;
int len;
#endif
{
YY_BUFFER_STATE b;
char *buf;
yy_size_t n;
int i;

/* Get memory for full buffer, including space for trailing EOB's. */
n = len + 2;
buf = (char *) yy_flex_alloc( n );
if ( ! buf )
YY_FATAL_ERROR( "out of dynamic memory in yy_scan_bytes()" );

for ( i = 0; i < len; ++i )
buf[i] = bytes[i];

buf[len] = buf[len+1] = YY_END_OF_BUFFER_CHAR;

b = yy_scan_buffer( buf, n );
if ( ! b )
YY_FATAL_ERROR( "bad buffer in yy_scan_bytes()" );

/* It's okay to grow etc. this buffer, and we should throw it
* away when we're done.
*/
b->yy_is_our_buffer = 1;

return b;
}
#endif

#ifndef YY_NO_PUSH_STATE
#ifdef YY_USE_PROTOS
static void yy_push_state( int new_state )
#else
static void yy_push_state( new_state )
int new_state;
#endif
{
if ( yy_start_stack_ptr >= yy_start_stack_depth )
{
yy_size_t new_size;

yy_start_stack_depth += YY_START_STACK_INCR;
new_size = yy_start_stack_depth * sizeof( int );

if ( ! yy_start_stack )
yy_start_stack = (int *) yy_flex_alloc( new_size );

else
yy_start_stack = (int *) yy_flex_realloc(
(void *) yy_start_stack, new_size );

if ( ! yy_start_stack )
YY_FATAL_ERROR(
"out of memory expanding start-condition stack" );
}

yy_start_stack[yy_start_stack_ptr++] = YY_START;

BEGIN(new_state);
}
#endif

#ifndef YY_NO_POP_STATE
static void yy_pop_state()
{
if ( --yy_start_stack_ptr < 0 )
YY_FATAL_ERROR( "start-condition stack underflow" );

BEGIN(yy_start_stack[yy_start_stack_ptr]);
}
#endif

#ifndef YY_NO_TOP_STATE
static int yy_top_state()
{
return yy_start_stack[yy_start_stack_ptr - 1];
}
#endif

#ifndef YY_EXIT_FAILURE
#define YY_EXIT_FAILURE 2
#endif

#ifdef YY_USE_PROTOS
static void yy_fatal_error( yyconst char msg[] )
#else
static void yy_fatal_error( msg )
char msg[];
#endif
{
(void) fprintf( stderr, "%s\n", msg );
exit( YY_EXIT_FAILURE );
}

/* Redefine yyless() so it works in section 3 code. */

#undef yyless
#define yyless(n) \
do \
{ \
/* Undo effects of setting up yytext. */ \
yytext[yyleng] = yy_hold_char; \
yy_c_buf_p = yytext + n; \
yy_hold_char = *yy_c_buf_p; \
*yy_c_buf_p = '\0'; \
yyleng = n; \
} \
while ( 0 )

/* Internal utility routines. */

#ifndef yytext_ptr
#ifdef YY_USE_PROTOS
static void yy_flex_strncpy( char *s1, yyconst char *s2, int n )
#else
static void yy_flex_strncpy( s1, s2, n )
char *s1;
yyconst char *s2;
int n;
#endif
{
register int i;
for ( i = 0; i < n; ++i )
s1[i] = s2[i];
}
#endif

#ifdef YY_NEED_STRLEN
#ifdef YY_USE_PROTOS
static int yy_flex_strlen( yyconst char *s )
#else
static int yy_flex_strlen( s )
yyconst char *s;
#endif
{
register int n;
for ( n = 0; s[n]; ++n )
;

return n;
}
#endif

#ifdef YY_USE_PROTOS
static void *yy_flex_alloc( yy_size_t size )
#else
static void *yy_flex_alloc( size )
yy_size_t size;
#endif
{
return (void *) malloc( size );
}

#ifdef YY_USE_PROTOS
static void *yy_flex_realloc( void *ptr, yy_size_t size )
#else
static void *yy_flex_realloc( ptr, size )
void *ptr;
yy_size_t size;
#endif
{
/* The cast to (char *) in the following accommodates both
* implementations that use char* generic pointers, and those
* that use void* generic pointers. It works with the latter
* because both ANSI C and C++ allow castless assignment from
* any pointer type to void*, and deal with argument conversions
* as though doing an assignment.
*/
return (void *) realloc( (char *) ptr, size );
}

#ifdef YY_USE_PROTOS
static void yy_flex_free( void *ptr )
#else
static void yy_flex_free( ptr )
void *ptr;
#endif
{
free( ptr );
}

#if YY_MAIN
int main()
{
yylex();
return 0;
}
#endif
#line 163 "flex2.lex"

static char kw[NUM_KW][MAX_KW_LEN] = {
"if", "enum", "else", "while", "return", "int"
};

int Hash(char *key){
int hashval = 0;

while (*key != '\0')
hashval += *key++;

return (hashval % HASHSIZE);
}

/* register keyword in hash table */
void InitHTable(void)
{
int i;
struct list *p, *q;
int hashval;

for (i = 0; i < NUM_KW; i++) {
if ((FindKeyWord(kw[i])) == FALSE) {
if ((p = (struct list *)malloc(sizeof(struct list))) == NULL) {
fprintf(stderr, "can't allocate memory\n");
exit(2);
}

strcpy((*p).keyword, kw[i]);
hashval = Hash(kw[i]);

if (hashtable[hashval] == NULL) {
hashtable[hashval] = p;
p->next = NULL;
}
else { /* Case: already registerd */
q = hashtable[hashval];
while (q->next != NULL)
q = q->next;
q->next = p;
p->next = NULL;
}
}
}
}

/* try to find out keyword in hash table */
int FindKeyWord(char *key)
{
struct list *p;
/*printf("Hash:%d\n", Hash(key));*/

for (p = hashtable[Hash(key)]; p != NULL; p = p->next)
if (!strcmp(key, (*p).keyword))
return (TRUE);

return (FALSE);
}

void FreeKeyWord(void)
{
int i;
struct list *p, *q;

for (i = 0; i < HASHSIZE; i++)
for (p = hashtable[i]; p != NULL; ) {
q = p->next;
free(p);
p = q;
}
}

void yyerror(){
printf("ERROR: ???\n");
}

int main(int argc, char *argv[]){
char word[MAX_KW_LEN];
InitHTable();
++argv, --argc; /* skip over program name */
if(argc > 0)
yyin = fopen(argv[0],"r");
else
yyin = stdin;
yylex();
printf("-------\nEND\n");
FreeKeyWord();
}

int yywrap(){ return 1;}

2008-05-23

Recursive Decent Programming

I made two easy programs below by means of recursive decent, which benefits for programmers as programming recursive method. As an example of it, these programs are used several recursive methods, but they run properly because I only programmed along syntax rule of regular expression.

No.1 caliculate.c


#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int get_cnt;
char ch;
char* s;

int logic();

void error(){
 printf("error");
 exit(0);
}

int ch2int(char c){
 if(ch=='0') return 0;
 else if(ch=='1') return 1;
 else error();
}

void get_char(void){
 ch = s[get_cnt];
 get_cnt++;
}

int logic_val(){
 int val;
 if('0'<=ch && ch<='9'){
  val = ch - '0';
  printf("%d\n",val);
  get_char();
  return val;
 }else error();
}

int element(){
 int tmp;
 if(ch=='('){
  printf("%c\n",ch);
  get_char();
  tmp = logic();
  if(ch==')'){
   printf("%c\n",ch);
   get_char();
  }else error();
  return tmp;
 }else return logic_val();
}

int operand(){
 if(ch=='-'){
  printf("%c\n",ch);
  get_char();
  return -(element());
 }
 return element();
}

int logic(){
 int tmp;
 tmp = operand();
 while(ch=='*'||ch=='+'){
  if(ch=='*'){
   printf("%c\n",ch);
   get_char();
   tmp *= operand();
  }else if(ch=='+'){
   printf("%c\n",ch);
   get_char();
   tmp += operand();
  }
 }
 return tmp;
}

int main(int argc,char *argv[]){
 int i,result;
 get_cnt=0;
 s = argv[1];
 get_char();
 result = logic();
 printf("result: %d\n",result);
}

No.2 notation.c


#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int get_cnt, temp;
char ch;
char* s;
int digit;

int logic();

void error(){
 printf("error");
 exit(0);
}

void get_char(void){
 ch = s[get_cnt];
 get_cnt++;
}

int val(){
 int val;
 if('0'<=ch && ch<='9'){
  val = ch - '0';
  printf("%d\n",val);
  get_char();
  return val;
 }else error();
}


void binary(){
 int val;
 if(ch=='0' || ch=='1'){
  val = ch - '0';
  get_char();
  if(ch!='\0'){
   binary();
  }
  temp += val * digit;
  digit*=2;
 }else error();
}

void octal(){
 int val;
 if('0'<=ch && ch<='7'){
  val = ch - '0';
  get_char();
  if(ch!='\0'){
   octal();
  }
  temp += val * digit;
  digit *= 8;
 }else error();
}

void decimal(){
 int val;
 if('0'<=ch && ch<='9'){
  val = ch - '0';
  get_char();
  if(ch!='\0'){
   decimal();
  }
  temp += val * digit;
  digit *= 10;
 }else error();
}

int notation(void){
 int tmp;
 if(ch=='2'){
  get_char();
  if(ch=='#'){
   get_char();
   binary();
   return temp;
  }
 }else if(ch=='8'){
  get_char();
  if(ch=='#'){
   get_char();
   octal();
   return temp;
  }
 }
 
 if('0'<=ch && ch<='9'){
  decimal();
  return temp;
 }else error();
}
 

int main(int argc,char *argv[]){
 int result;
 temp = 0;
 digit=1;
 get_cnt=0;
 s = argv[1];
 get_char();
 result = notation();
 printf("result: %d\n",result);
}

　今日友達に「意志が弱い」って言われた。
　大抵は終電まで大学にいるけど、それでも家が近いから帰ってもまだ深夜一時位だ。それならあと２時間は勉強できるはずなのに、勉強してない。それが出来ない原因は俺の意志が弱いからだ、って言われた。

　核心つきすぎて笑えたｗ確かにｗ

　OS advance courceとlanguage processorのおかげで、どれだけ勉強しても多分taskが0になる事はほぼない。というか、UMLとかdata miningとかCPUの中の人とか、色々勉強したいこともある。だから、勉強出来る時間と体力が残っていれば、少しでも取り掛かるべきだ。でなければ、友達や天才ハカーに追いつくことは出来ないし、一方的に離されていくだけだ。ただでさえ動くのが遅かったのに。

　すぐに現状に満足してしまうのは悪い癖だ。まだまだ精神的に幼稚。もっと自分自身を律する生活を心がけよう。

2008-05-19

Lex

なんとか明日に入る前にcommentの字句解析 Lex versionが終わった～！あとは、残ったしょぼい構文規則書いて、今まで作った字句解析と出力内容合わせるだけだな。うむ、楽勝だわ。

そろそろ課題+中間ラッシュかな。なるべくOS勉強する時間を減らさないように留意しよう。Computer IntelligenceのPrologもそこそこ難しい内容に入ってきたし、Language Processorは相変わらずだし、今後の情報実験は苦手な信号処理の内容に入っていくわけだけど、こんなところで躓いてたら笑えないわ。もし躓いたとしたら、きっと俺の意志がカスだった、ってことだろう。

やるべきことはガンガン処理して、時間増やしまくって、OS入門をどんどん進ませよう。

追記：
あんだけ頑張って作ったlexical analysisを、一瞬で再現してしまうLexは凄すぎるｗ。そして、そのLexを作ったEric Emerson SchmidtとMike Leskはどう考えても天才ハカーｗ

3年生になってようやくSecure Shell使ってwaseda Linux terminalにloginする方法を覚えた俺とはレベルが違い過ぎるな。俺ももっとcomputer scienceのskillを上げて、色々できるようになりたいわ。とかいいつつ、未だにlaptopがwindowsの時点で自己矛盾o...rz

2008-05-11

Test:Highlight Code

compiler出来たので、試しにHighlight Codeを使ってみるテスト。
→読み込みが遅すぎるのでSyntaxHifhlighterに変更

-----------------------------


/**** function declaration ****/
statement();

/**** files and options ****/
int  f_source;  /* source file */

/**** source program input  ****/

enum{ UNPACK=4 };
  /* WL assumes sizeof(int)=4;                            */
enum{ M_line=25, M_line_c=M_line*UNPACK };
int  line[M_line], line_c[M_line_c], lcp, lcm;
  /* a character string is packed into an int array with  */
  /* a terminator '\0'; it will be unpacked into another  */
  /* array one charater per element.                      */
  /* line holds the current line in packed form;          */
  /* line_c holds the current line in unpacked form,      */
  /* whereas lcm shows the number of characters in it,    */
  /* and lcp points to the next character position.       */
enum{ M_char=0x80 }; /* characters are in 7bit code       */
enum{ EOFC=0x04 };   /* EOF state is coded to this value. */
int  ch;       /* the next character available */
int  line_no;  /* current line number */

init_char(){
  f_source= stdin;  /* <== This line will be removed. */
  lcp= lcm= 0;  line_no= 0;
}

/* 一文字取ってくる関数 */
get_char(){
  if( lcp>=lcm ){
    if( fgets(line, M_line, f_source)==NULL ){
      line_c[0]= EOFC;  lcm= 1;
    }else{
      lcm= unpack(line, line_c);
    }
    lcp= 0;  line_no= line_no+1;
  }
  ch= line_c[lcp]%M_char;  /* filter to 7bits */
  if( ch!=EOFC )  lcp= lcp+1;
    /* not to try input again for an EOF'ed file */
}


/**** output ****/

/* number */

/* output a positive number into a file */
/*positive(int number, int file){
  if( number>=10 )  positive(number/10, file);
  fputc(number%10+'0', file);
}*/

/* output a number in decimal onto a file */
/*decimal(int number, int file){
  if( number<0 ){
    number= -number;  fputc('-', file); 
    if( number<0 ){ fputs("2147483648", file); return; }
  }
  positive(number, file);
}*/

/** number output **/
positive(int v, int file){ /*正の整数を表示する*/
  if( v>=10 )  positive(v/10, file);
  fputc(v%10+'0', file);
}
decimal(int v, int file){ /*整数を表示するための関数*/
  if( v<0 ){
    fputc('-', file);
    positive(-v, file);
  }else{
    positive(+v, file);
  }
}

/** system error(abortion) **/
system_error(int message[]){
  decimal(line_no, stderr);  fputs("::SYSTEM::", stderr);
  fputs(message, stderr); fputc('\n', stderr);
  exit(1);
}
/* error message output: to stderr */
/*system_error(int message[]){*/
/* output a message and abort the compilation */
/*  decimal(line_no, stderr); fputs("::SYSTEM::", stderr);
  fputs(message, stderr); fputc('\0', stderr);
  exit(-1);
}*/

/**** token output ****/
/*enum{
  Else, Enum, If, Int, Return, While,
  Name, Const, Character, String,
  Plus, Minus, Not, Amper,
  Times, Divide, Remainder,
  Les, Leq, Grat, Geq,
  Equ, Neq,
  And, Or,
  Becomes,
  Comma, Semicolon, Lpar, Rpar, Lbracket, Rbracket, Lbrace, Rbrace,
  Nontoken, Endfile, Comment
};*/
enum{
  Nontoken,
  Times, Divide, Remainder,
  Les, Leq, Grat, Geq,
  Equ, Neq,
  And, Or,
  Becomes,
  Lbracket,
  Comma, Rpar, Rbracket,
  Else, Rbrace, Semicolon,
  If, Return, While, Lbrace,
  Lpar, Plus, Minus, Not, Amper,
  Const, Character, String,
  Name, Int, Enum,
  Endfile, Comment
};


int  token;  /* next token */
int  value;  /* integer value for token=Const,Character */
int  spell;  /* spelling for token=Name */
int  word[M_line];  /* contents for token=String, Comment */

/**** Name identity ****/
enum{ id_p, next_p, spell_p };  /* spell_table's components */

enum{ M_spell_table=2000 };                    /* size of spell_table */
int  spell_table[M_spell_table], spell_index;  /* for usual names */
  /* any name is identified with a positon p in this table;   */
  /*  spell_table[p+id_p]:    open for clients(defaults to 0) */
  /*  spell_table[p+next_p]:  link to similar spellings       */
  /*  spell_table[p+spell_p] and upward: packed spelling      */



/**** hashing ****/

/** Reserved Words  **/
enum{ M_reserved=6 };   /* number of reserved words */
enum{ M_spell_r=11  };  /* size of reserved word spelling */
enum{ P_r= 2, M_r= 7 }; /* hash function parameters */
enum{ NIL=-1 };

int token_r[M_r];            /* token for each reserved word */
int spell_r[M_spell_r], srp; /* spell table for reserved words */
int spell_rp[M_r];           /* pointer to their spelling */


int h(int spell_c[], int length, int P, int M){ int i, v;
/* 第3、第4引数を元にハッシュ関数を実行する */
/* computes a hash function value for an unpacked spell */
/* with parameters P and M.                             */

  v= 0; i= 0;
  /* iが配列の長さと同じになるまで、
    v=(v*第3引数+第1引数)&第4引数を実行し、
    iをインクリメントする作業を繰り返す。
  その後、vの値を返す。      */
  while( i!=length ){
    v= (v*P + spell_c[i])%M;  i= i+1;
  }
  return v;
}


enter_reserved(int spell[], int token_value){
/* 予約語を設定する関数 */
/* helper function: 関数init_reservedで使われる */

  int spell_c[M_line_c], scm;  /* unpacked spell */
  int r;

  scm= unpack(spell, spell_c); /* spellの分解表現をspell_cに書き込む */
  r= h(spell_c, scm, P_r, M_r); /* 各変数を元にハッシュにかける */
  spell_rp[r]= srp;    /* ハッシュ値の場所にsrpを格納する */
  token_r[r]= token_value;  /* ハッシュ値の場所にtoken_valueを格納する */
 
  /* spell_cの分解表現をspell_r[srp]が指す番地に、scmの長さだけ書き込む */
  srp= srp+pack(spell_c, scm, &spell_r[srp]);
}

init_reserved(){ int i;
/* 予約語の初期化 */
/* sets up tables related to the reserved words */

  i= 0;
  while( i!=M_r ){
   token_r[i]= Name;
 spell_rp[i]= 0;
 i= i+1;
  }
  spell_r[0]= 0;  srp= 1;
  enter_reserved("else",  Else);
  enter_reserved("enum",  Enum);
  enter_reserved("if",    If);
  enter_reserved("int",   Int);
  enter_reserved("return",Return);
  enter_reserved("while", While);
  if( srp!=M_spell_r ) system_error("reserved word preparation");
}

int is_reserved(int spell_c[], int length){
/* 引数として渡されたspellが予約語かどうかをチェックする関数 */
/* checks if an unpacked spell is a reserved word:  */
/* returns its token value if it is,                */
/* returns Name if not.                             */

  int  spell[M_line], sm;
  int  r, i, s;

  r= h(spell_c, length, P_r, M_r);
  s= spell_rp[r];
  sm= pack(spell_c, length, spell);
  i= 0; while( i!=sm && spell[i]==spell_r[s+i] ) i= i+1;
  if( i==sm ) token= token_r[r];
  else        token= Name;
  return (token!=Name);
}

/** usual names **/
enum{ P_u=53, M_u=128 };  /* hash function parameters for usual names */

int hash[M_u];

/* 配列hashを初期化する関数 */
init_names(){ int i;
  i= 0; while( i!=M_u ){ hash[i]= NIL;  i= i+1; }
  spell_index= 0;
}

int find(int spell_c[], int length){
/* 表からspellの位置を探し、もしまだ登録していなかったらそのspellを登録する関数 */
/* finds the position in spell_table of an unpacked spell. */
/* registers the spell if not yet registered.              */

  int spell[M_line], sm;  int  h0, f, r, i;
  enum{ sentinel=0x7FFFFFFF }; /* for table lookup */

  h0= h(spell_c, length, P_u, M_u);  f= hash[h0];
  sm= pack(spell_c, length, spell);  spell[sm]= sentinel;
  i= 0;
  while( f!=NIL && spell[i]!=sentinel ){
    i= 0; while( spell[i]==spell_table[f+spell_p+i] ) i= i+1;
    r= f;  f= spell_table[f+next_p];
  }
  if( spell[i]==sentinel )  return r;
  if( spell_index+spell_p+sm>M_spell_table )
    system_error("spell_table overflow");
  r= spell_index; 
  spell_table[r+id_p]= 0;
  spell_table[r+next_p]= hash[h0];  hash[h0]= r;
  i= 0; while( i!=sm ){ spell_table[r+spell_p+i]= spell[i]; i= i+1; }
  spell_index= spell_index+spell_p+sm;
  return r;
}

/**** characterization ****/

enum{ SPACE, ONE, TWO, LETTER, H_LETTER, O_DIGIT, DIGIT };
 /*  SPACE: space, new line, etc.                                     */
 /*  ONE:   +, -, etc., which forms a token by itself.                */
 /*  TWO:   <, =, etc., which forms a token by itself or              */
 /*                           forms a token with the next character.  */
 /*  LETTER: lower-case, upper-case letters and _; except             */
 /*  H_LETTER: a-f and A-F, which are used also as hexadecimal digits.*/
 /*  O_DIGIT: 0-7, which are used as octal digits.                    */
 /*  DIGIT:   8, 9                                                    */
enum{ p0=0x1, p1=0x100, p2=0x10000, p3=0x1000000 };
 /*  positioning for packup representation  */
 /*  SPACE*p3                                       */
 /*  ONE  *p3                            + token*p0 */
 /*  TWO  *p3 + token'*p2 + next-char*p1 + token*p0 */
 /*  kind *p3                            + value*p0 */
 /*     (kind= LETTER, H_LETTER, O_DITIT, DIGIT)    */
int char_type[M_char];

enum{ ERR=-3, HEX=-2, OCT=-1 };
int esc_value[M_char];
 /* character value in escaped character representation */
 /* those values are non-negative;                      */
 /* negative value indicates need of special care       */

/* character typeをセットする関数 */
set_ctype(int c, int kind, int token_n, int char_n, int token){
 /* helper function used by init_ctype                      */
 /*  char_type[c]= <kind, token_n, char_n, token> (packup)  */

 char_type[c]=kind*p3+token_n*p2+char_n*p1+token*p0;
}

init_ctype(){ int c;  enum{ _ =0 };
  /* sets up char_type and esc_value */

  c= 0; while( c!=M_char ){ set_ctype(c, ONE,_,_,Nontoken); c= c+1; }
  c= 'a'; while( c<='f' ){ set_ctype(c, H_LETTER,_,_,c-'a'+10); c= c+1; }
  c= 'g'; while( c<='z' ){ set_ctype(c, LETTER,_,_,_); c= c+1; }
  c= 'A'; while( c<='F' ){ set_ctype(c, H_LETTER,_,_,c-'A'+10); c= c+1; }
  c= 'G'; while( c<='Z' ){ set_ctype(c, LETTER,_,_,_); c= c+1; }
  c= '0'; while( c<='7' ){ set_ctype(c, O_DIGIT,_,_,c-'0'); c= c+1; }
  c= '8'; while( c<='9' ){ set_ctype(c, DIGIT,_,_,c-'0'); c= c+1; }
  set_ctype('_',  LETTER,_,_,_);
  set_ctype(EOFC, ONE,_,_,Endfile);
  set_ctype(' ',  SPACE,_,_,_); set_ctype('\n', SPACE,_,_,_);
  set_ctype('\r', SPACE,_,_,_); set_ctype('\t', SPACE,_,_,_);
  set_ctype('\v', SPACE,_,_,_); set_ctype('\f', SPACE,_,_,_);
  set_ctype('(', ONE,_,_,Lpar);     set_ctype(')', ONE,_,_,Rpar);
  set_ctype('[', ONE,_,_,Lbracket); set_ctype(']', ONE,_,_,Rbracket);
  set_ctype('{', ONE,_,_,Lbrace);   set_ctype('}', ONE,_,_,Rbrace);
  set_ctype(',', ONE,_,_,Comma);    set_ctype(';', ONE,_,_,Semicolon);
  set_ctype('+', ONE,_,_,Plus);     set_ctype('-', ONE,_,_,Minus);
  set_ctype('*', ONE,_,_,Times);    set_ctype('%', ONE,_,_,Remainder);
  set_ctype('"', ONE,_,_,String);   set_ctype('\'',ONE,_,_,Character);
  set_ctype('=', TWO, Equ,'=', Becomes);
  set_ctype('', TWO, Or, '', Nontoken);
  set_ctype('&', TWO, And,'&', Amper);
  set_ctype('!', TWO, Neq,'=', Not);
  set_ctype('<', TWO, Leq,'=', Les);
  set_ctype('>', TWO, Geq,'=', Grat);
  set_ctype('/', TWO, Comment,'*', Divide);

  c= 0; while( c!=M_char ){ esc_value[c]= c; c= c+1; }
  c= '0'; while( c<='7' ){ esc_value[c]= OCT; c= c+1; }
  esc_value['x']= esc_value['X']= HEX;
  esc_value[EOFC]= esc_value['\n']= ERR;
  esc_value['a']= 0x07; /* \a  alert(BEL) */
  esc_value['b']= 0x08; /* \b  back space(BS) */
  esc_value['f']= 0x0C; /* \f  form feed(FF) */
  esc_value['n']= 0x0A; /* \n  new line(LF) */
  esc_value['r']= 0x0D; /* \r  carriage return(CR) */
  esc_value['t']= 0x09; /* \t  horizontal tab(HT) */
  esc_value['v']= 0x0B; /* \v  vertical tab(VT) */
}



/**** integer constant ****/

/*  integer-constant = decimal  octal  hexadecimal                    */
/*  decimal     = non-zero { digit }                                    */
/*  octal       = '0' { octal-digit }                                   */
/*  hexadecimal = '0' ('x''X') hexadecimal-digit { hexadecimal-digit } */
/* int型の文字を解析する関数 */
integer_constant(){
 /* ch: digit,  token= Const */

  value= 0;
  /* 10進数の場合 */
  if( ch!='0' ){
   /* 数字列が続く間、valueに数字列を書き出していく */
    while( char_type[ch]/p3 >= O_DIGIT ){
      value= value*10 + ch-'0';  get_char();
    }

  /* 8進数または16進数の場合*/
  }else{
    get_char();  /* 次の一文字を読み取る */

 /* 16進数の場合 */
    if( esc_value[ch]==HEX ){
      get_char(); /* 次の一文字を読み取る */
  
   /* chがa-fまたはA-Fまたは数字の間、valueにその値を書き出していく */
      while( char_type[ch]/p3 >= H_LETTER ){
        value= value*0x10 + char_type[ch]%0x10;  get_char();
      }
 
 /* 8進数の場合 */
    }else{
   /* chが0-7の間、valueにその値を書き出していく  */
      while( char_type[ch]/p3 == O_DIGIT ){
        value= value*010 + char_type[ch]%010;  get_char();
      }
    }
  }
}



/**** character constant ****/

/*  character-constant = ''' ( cahracter-c  escaped-chracter ) '''        */
/*  character-c:: any character other than '\' and newline                 */
/*  escaped-character  = simple-escape  octal-escape  hexadecimal-escape */
/*  octal-escape = '\' octal-digit [ octal-digit [ octal-digit ]]          */
/*  hexadecimal-escape = '\' ('x''X') hexadecimal-degit [hexadecimal-digit] */
/*  simple-escape: '\' followed by a character other then one shown above  */

int escaped_character(){ int v;
 /* ch: '\' */

  get_char();  v= esc_value[ch];
  if( v < 0 ){
    if( v==OCT ){ int j;  /* octal-escape */
      v= char_type[ch]%010; j= 1;  get_char();
      while( char_type[ch]/p3 == O_DIGIT ){
        v= v*010 + char_type[ch]%010; j= j+1;  get_char();
      }
      if( j>3 )  token= Nontoken;
    }else
    if( v==HEX ){ int j;  /* hexadecimal-escape */
      get_char();  v= 0;  j= 0;
      while( char_type[ch]/p3 >= H_LETTER ){
        v= v*0x10 + char_type[ch]%0x10; j= j+1;  get_char();
      }
      if( j==0  j>2 )  token= Nontoken;
    }else{ /* illegal character */
      token= Nontoken;  v= 0;
    }
  }else{ /* simple-escape */
    get_char();
  }
  return v;
}

character_constant(){
 /* chのcharacter codeを判定し、結果をvalueに格納する関数 */
 /* ch: one following ''',  token= Character */
 /* sets value with the character code       */

  if( ch==EOFC  ch=='\n'  ch=='\'' ){
    token= Nontoken;
  }else
  if( ch=='\\' ){ /* escaped-character */
    value= escaped_character();
  }else{
    value= ch;  get_char();
  }
  if( ch=='\'' )  get_char();
  else  token= Nontoken;
}
   


/**** string constant ****/

/*  string-constant = '"' { character-s  escaped-character } '"' */
/*  character-s:  any character other than '"' and newline        */

string_constant(){
 /* 入力文字列を読み込み、配列wordに書き出す関数 */
 /* ch: character following '"', token= String */

  int spell_c[M_line_c], scm;

  scm= 0;
 
  /* EOFCや改行、または'"'が来るまでの間、以下の処理を繰り返す */
  while( ch!=EOFC && ch!='\n' && ch!='"' ){int w;
   /* chが\なければ */
    if( ch!='\\'){
      w= ch;  get_char(); /* chをwに格納し、1文字読み込む */
    }else   /* chが\であれば */
      w= escaped_character(); /* 関数escacape_characterを呼び出す */
    spell_c[scm]= w;   /* wを配列spell_c書き込む */
 scm= scm+1;     /* scmのインクリメント */
  }
  /* 次の文字列が'"'の場合は、一文字読み込む */
  if( ch=='"' )  get_char();
  else  token= Nontoken;
  pack(spell_c, scm, word); /* 読み込んだ文字列をwordに書き出す */
}

/**** comment ****/

/*  comment:  begins with '/''*' and ends with '*''/'      */
/*            may extend to several lines                  */

comment(){
 /* ch: character following '/''*', token= Comment  */

  while( ch!=EOFC && ch!='*' )  get_char();
  while( ch=='*' ){
    get_char();
    if( ch=='/'){
      get_char();   return;
    }
    while( ch!=EOFC && ch!='*' ) get_char();
  }
  ch= Endfile;
}


/**** name and reserved word ****/

/*  name = letter { digit  letter }                       */
/*  letter:  lower-case and upper-case letters and '_'     */
/*  digit:   0 - 9                                         */
/*  following are reserved and used for special purposes   */
/*    else  enum  if  int  return  while                   */


identifier(){
  /* tokenを変数名として定義する関数 */
  /* ch: letter,  token= Name */

  int  spell_c[M_line_c], scm;

  scm= 0;
  /* chのtypeがlower-case,upper-case letters and _ */
  /* または、a-f,A-Fのである間        */
  while( char_type[ch]/p3 >= LETTER ){
   /* chを配列spell_cに入れ、次の一文字を読み取る */
    spell_c[scm]= ch;  scm= scm+1;  get_char();
  }
 
  /* 予約語として定義されていれば、関数から抜け出す。 */
  if( is_reserved(spell_c, scm) )  return;
 
  /* 読み取った文字列が既に変数として定義されているかどうかを探す */
  spell= find(spell_c, scm); /* */
}



/**** token(lexical element)  ****/
get_token(){
/* 次のtokenを解析する関数 */
 /* invariant:  ch holds the next character available */

  /* chに入っている値がスペースの間、読み捨て続ける */
  while( char_type[ch]/p3 == SPACE )  get_char();
 
  /* chが空白または演算子の場合 */
  if( char_type[ch]/p3 < LETTER ){ /* ONE or TWO */
    token= char_type[ch]%p1; /*tokenにchの型を格納する*/
   
 /* 1文字だけで判断できる場合 */
 if( char_type[ch]/p3 == ONE ){
  
   /* tokenがEndfileで無ければ、もう一文字読み込み、 */
   /* tokenがString型であれば文字列を読み込み配列wordに書き込む */
   /* tokenがCharacter型であればchのcharacter codeを判定する */
      if( token!=Endfile )  get_char();
      if( token==String ){ string_constant(); return; }
      if( token==Character ){ character_constant(); return; }

 /* 2文字読まなければ判断できない場合 */
    }else{
   int c;
      c= ch;  /* chをcに格納し、      */
   get_char(); /* さらにもう一文字読み込む */
  
   /* 一文字目と二文字目の型が一致した場合 */
      if( ch==(char_type[c]/p1)%p1 ){
        token= (char_type[c]/p2)%p1; /* tokenに型名を格納する */
  get_char();      /* 1文字読み込む */
      }
  
   /* tokenがCommentであれば、関数commentを呼び出す */
      if( token==Comment ){ comment(); get_token(); return; }
    }
  }
 
  /* chが8進数または10進数である場合 */
  else if( char_type[ch]/p3 >= O_DIGIT ){
    token= Const;
 integer_constant();

  /* chが英字である場合(16進数のa～fも含む) */
  }else{
    token= Name; /* tokenにNameを格納し、   */
 identifier(); /* tokenを変数名として定義する  */
  }
}

init_token(){
  init_reserved();
  init_names();
  init_ctype();
  init_char();  get_char();
}


/**** tokenDriver ****/

int representation[100], rprm;
int link_repr[Comment+1];

/* 表記法を登録する関数 */
set_repr(int token, int repr[]){
  int repr_c[20], rprc;
 
  /* reprの配列値の分解表現をrepr_cに書き込む */
  rprc= unpack(repr, repr_c);
  link_repr[token]= rprm; /* link_repr[token]にrprmの値を格納 */
 
  /* repr_cの配列値の詰め込み表現をrepresentation[rprm]の番地に書き込む */
  /* rprmには書き込んだ長さを加算する */
  rprm= rprm + pack(repr_c, rprc, &representation[rprm]);
}

init_repr(){
  rprm= 0;
  set_repr(Else, "else");
  set_repr(Enum, "enum");
  set_repr(If,   "if");
  set_repr(Int,  "int");
  set_repr(Return, "return");
  set_repr(While,"while");
  set_repr(Name, "Name");
  set_repr(Const,"integer");
  set_repr(Character, "character");
  set_repr(String,"string");
  set_repr(Plus, "+");
  set_repr(Minus, "-");
  set_repr(Not, "!");
  set_repr(Amper,"&");
  set_repr(Times, "*");
  set_repr(Divide, "/");
  set_repr(Remainder, "%");
  set_repr(Les, "<");
  set_repr(Leq, "<=");
  set_repr(Grat, ">");
  set_repr(Geq, ">=");
  set_repr(Equ, "==");
  set_repr(Neq, "!=");
  set_repr(And, "&&");
  set_repr(Or,  "");
  set_repr(Becomes,"=");
  set_repr(Comma, ",");
  set_repr(Semicolon, ";");
  set_repr(Lpar, "(");
  set_repr(Rpar, ")");
  set_repr(Lbracket, "[");
  set_repr(Rbracket, "]");
  set_repr(Lbrace, "{");
  set_repr(Rbrace, "}");
  set_repr(Nontoken, "???");
  set_repr(Endfile, "EOF");
  set_repr(Comment, "comment");
}


/* ------------- error execution part ------------- */
/* error message */
int error_count;  /* errors in a compilation */

error(int message[]){
/* output a message */
  decimal(line_no, stderr); fputs("::ERROR::", stderr);
  fputs(message, stderr);  fputc('\n', stderr);
  error_count= error_count+1;
}
missing(int token_repr[]){
/* error: when a token is missing */
  int packed[M_line], unpacked[M_line_c], l;

  l= unpack("missing ", unpacked);
  l= unpack(token_repr, &unpacked[l])+l;
  pack(unpacked, l, packed);  error(packed);
}

skip_to(int target){
  if( token<target ){
    error("some tokens skipped");
    while( token<target ) get_token();
  }
}

init_error(){
/* error_count initialization */
  error_count= 0;
}

/* ------------- parsing part ------------- */
expression();
logical();
arith_term();
logical_term();


/*一次子＝名前｜整数定数｜文字定数｜文字列定数｜‘(’式‘)’*/
primary(){
  if( token==Name       token==Const
    token==Character  token==String ){
    get_token();
  }else
  if( token==Lpar ){
    get_token();
    expression();
    if( token==Rpar ) get_token();  else missing(")");
  }else{
    error("not a primary");
  }
}


/*
算術式＝算術項｛加減演算子 算術項｝
加減演算子＝‘+’｜‘-’
*/
arithmetic(){
  arith_term();
  while( token==Plus  token==Minus ){
    get_token();
    arith_term();
  }
}


/* 論理式 = 論理項｛‘’ 論理項｝*/
logical(){
  logical_term();
  while( token==Or ){
    get_token();
    logical_term();
  }
}



/*
後置式 ＝ 一次子 ［添字並び｜実引数並び］
添字並び ＝ ‘[’式‘]’
実引数並び ＝ ‘(’［式｛‘,’式｝］‘)’
*/
suffixed(){
  primary();
  if( token==Lbracket ){
    get_token();
    expression();
    if( token==Rbracket ) get_token(); else missing("]");
  }else
  if( token==Lpar ){
    get_token();
    if( token!=Rpar ){
      expression();
      while( token==Comma ){
        get_token();
        expression();
      }
    }
    if( token==Rpar ) get_token();  else  missing(")");
  }
}


/*
大小比較 ＝ 算術式｛比較演算子 算術式｝
比較演算子 ＝ ‘<’｜‘<=’ ｜‘>’｜‘>=’
*/
comparison(){
  arithmetic();
  while( token==Les   token==Leq
       token==Grat  token==Geq ){
    get_token();
    arithmetic();
  }
}



/*
前置式 ＝ 後置式｜単項演算子前置式
単項演算子 ＝‘+’｜‘-’｜ ‘!’｜‘&’
*/
prefixed(){
  if( token==Plus  token==Minus
    token==Not   token==Amper ){
    get_token();
    prefixed();
  }else{
    suffixed();
  }
}


/*
等値比較 ＝ 大小比較｛等値演算子 大小比較｝
等値演算子 ＝‘==’｜‘!=’
*/
equality(){
  comparison();
  while( token==Equ  token==Neq ){
    get_token();
    comparison();
  }
}


/* 式 ＝ 論理式｛‘=’論理式｝*/
expression(){
  logical();
  while( token==Becomes ){
    get_token();
    logical();
  }
  skip_to(Comma);
}


/*定数式 ＝ 算術式*/
const_expr(){
  arithmetic();
  skip_to(Comma);
}


/*
 実際の定数式には，定数 (constant)（翻訳時にも計算できる値）を
 表わしたものでなければならない，という制約がつく。
 ここでは，そうした「形」以外から来る条件を無視しているので，
 単に arithmetic を呼び出す形の本体となっている。
*/


/*
算術項 ＝ 前置式｛乗除演算子 前置式｝
乗除演算子 ＝ ‘*’｜‘/’｜‘%’
*/
arith_term(){
  prefixed();
  while( token==Times  token==Divide  token==Remainder ){
    get_token();
    prefixed();
  }
}


/* 論理項 ＝ 等値比較｛‘&&’等値比較｝*/
logical_term(){
  equality();
  while( token==And ){
    get_token();
    equality();
  }
}


/*
 変数宣言 ＝‘int’変数宣言子｛‘,’変数宣言子｝‘;’
 変数宣言子 ＝ 名前［‘[’定数式‘]’］
*/

/* 宣言子を処理する関数 declarator */
declarator(){
  if( token==Lbracket ){
    get_token();
    const_expr();
    if( token==Rbracket ) get_token(); else missing("]");
  }
}

var_decl(){
  declarator();
  while( token==Comma ){
    get_token();
    if( token==Name ) get_token(); else missing("a name");
    declarator();
  }
  if( token==Semicolon )  get_token(); else missing(";");
  skip_to(Rbrace);
}


/*
  列挙宣言 ＝‘enum’‘{’列挙宣言子｛‘,’列挙宣言子｝‘}’‘;’
  列挙宣言子 ＝ 名前［‘=’定数式］
*/
enum_declarator(){
  if( token==Name ){
    get_token();
    if( token==Becomes ){
      get_token();
      const_expr();
    }
  }else{
    error("can't use reserve word");
  }
}

enum_decl(){
  get_token();
  if( token==Lbrace )  get_token(); else missing("{");
  enum_declarator();
  while( token==Comma ){
    get_token();
    enum_declarator();
  }
  if( token==Rbrace ) get_token(); else missing("}");
  if( token==Semicolon ) get_token(); else missing(";");
  skip_to(Rbrace);
}

/* 式文 ＝ 式‘;’*/
/* 式の先頭に現われうる字句であるかどうかを判別する関数 */
int first_expr(int token){
  return (Lpar<=token && token<=Name);
}

expr_statement(){
  expression();
  if( token==Semicolon ) get_token(); else missing(";");
}


/* while文 ＝‘while’‘(’式‘)’文*/
while_statement(){
  get_token();
  if( token==Lpar ) get_token(); else missing("(");
  expression();
  if( token==Rpar ) get_token(); else missing(")");
  statement();
}



/* 区画 ＝‘{’｛変数宣言｜列挙宣言｝｛ 文 ｝‘}’*/
/* 文の先頭に現われうる字句であるかどうかを判別する関数*/
int first_stmt(int token){
  return (Semicolon<=token && token<=Name);
}

block(){
  get_token();
  while( token==Int  token==Enum ){
    if( token==Int ){
      get_token();
      if( token==Name ) get_token(); else missing("a name");
      var_decl();
    }else{
      enum_decl();
    }
  }
  skip_to(Rbrace);
  while( first_stmt(token) ){
    statement(); skip_to(Rbrace);
  }
  if( token==Rbrace ) get_token(); else missing("}");
}


/* if文 ＝‘if’‘(’式‘)’ 文［‘else’文］*/
/* [制約]から，最初の「文」を読み捨てた後に 字句‘then’が現れれば，
   それはこの‘if’に 呼応したものであるので，一緒に処理する*/
if_statement(){
  get_token();
  if( token==Lpar ) get_token(); else missing("(");
  expression();
  if( token==Rpar ) get_token(); else missing(")");
  statement();
  if( token==Else ){
    get_token();
    statement();
  }
}


/* return文 ＝‘return’［式］‘;’*/
return_statement(){
  get_token();
  if( first_expr(token) ){
    expression();
  }
  if( token==Semicolon ) get_token(); else missing(";");
}

/*
  文 ＝ 空文｜式文｜if文｜while文｜return文｜ 区画
  空文＝‘;’
*/
statement(){
  if( token==Semicolon ){
    get_token();  /* empty-statement */
  }else
  if( first_expr(token) ){
    expr_statement();
  }else
  if( token==If ){
    if_statement();
  }else
  if( token==While ){
    while_statement();
  }else
  if( token==Return ){
    return_statement();
  }else
  if( token==Lbrace ){
    block();
  }else{
    error("not a statement");
  }
  skip_to(Else);
}


/*
関数宣言 ＝［‘int’］名前  引数仕様並び‘;’
引数仕様並び ＝‘(’［単引数仕様 ｛‘,’単引数仕様｝］‘)’
単引数仕様 ＝‘int’［ 名前 ］［‘[’‘]’］
関数定義 ＝［‘int’］名前  関数定義本体
関数定義本体 ＝ 仮引数仕様並び  区画
仮引数仕様並び ＝‘(’［仮引数仕様｛‘,’仮引数仕様｝］‘)’
仮引数仕様 ＝［‘int’］名前［‘[’‘]’］
*/
enum{ n_param, n_int, n_name };

/*配列 c を用いてこれらの出現回数を記録しておき，
  関数宣言であるか関数定義であるかが判明した時点で，
  正しい形をしていたかどうかを判断する。*/
parameter(int c[]){ int s;
  s= 0;
  if( token==Int  ){ s= 1;  c[n_int]= c[n_int]+1;   get_token(); }
  if( token==Name ){ s= 1;  c[n_name]= c[n_name]+1; get_token(); }
  if( token==Lbracket ){
    if( s==0 ) error("[ without int or name");
    get_token();
    if( token==Rbracket ) get_token(); else missing("]");
  }
  c[n_param]= c[n_param]+1;
}

func_decl(){
  int c[n_name+1];
  c[n_param]= c[n_int]= c[n_name]= 0;
  if( token==Lpar ) get_token(); else missing("(");
  if( token!=Rpar ){
    parameter(c);
    while( token==Comma ){
      get_token();
      parameter(c);
    }
  }
  if( token==Rpar ) get_token(); else missing(")");
  if( token==Semicolon ){
    if( c[n_param]!=c[n_int] ) error("missing int's in declaration");
    get_token();
  }else
  if( token==Lbrace ){
    if( c[n_param]!=c[n_name] ) error("missing names in definition");
    block();
  }else{
    missing("; or block");
  }
}


/*宣言  ＝  変数宣言 ｜ 列挙宣言 ｜ 関数宣言 ｜ 関数定義
  算譜  ＝  宣言｛ 宣言 ｝*/

/* token が 宣言(declaration)の先頭に来る
   字句(int, enum, 名前) であるとき真になる関数 */
int first_decl(int token){
  return (Name<=token && token<=Enum);
}

declaration(){
  if( token==Int ){
    get_token();
    if( token==Name ) get_token(); else missing("a name");
    if( token==Lpar ){
      func_decl();
    }else{
      var_decl();
    }
  }else
  if( token==Name ){
    get_token();
    func_decl();
  }else
  if( token==Enum ){
    enum_decl();
  }else{
    error("not a declaration");
  }
  skip_to(Name);
}


program(){
  declaration();
  while( first_decl(token) ){
    declaration();
  }
}


main(){
  init_error();
  init_token();
  init_repr(); /* 表記法を初期化する */
  get_token();
  program();
  if( error_count==0 ){
    fputs("Compilation succeeds\n", stdout);
  }else{
    fputs("Compilation fails with ", stdout);
    decimal(error_count, stdout);
    fputs(" errors\n", stdout);
  }
}

登録: 投稿 (Atom)