2011-04-16

Xv6: Modern OS-like Case Study

PDF ver.


Yohei Yasukawa
Operating System
Dec. 15th, 2010
Case Study - Xv6


1. Introduction


Like case study sections in Modern Operating System, this paper introduces about Xv6, an teaching operating system, and shows how it runs in order to deeply understand a operating system concept.


A. What’s Xv6?


Xv6 is a teaching operating system that was developed by MIT in 2002, and being used as a material of operating system courses for graduate students since then.  And it is becoming famous as a well-developed teaching operating system. In fact, in 2010, it is used as a teaching material of operating system courses in not only MIT, but also Rutgers University, Yale University, Johns Hopkins University, and Stanford University, because of its simplicity.

For many years, operating systems had been taught as one of essential courses for understanding computer science. However, there were few operating systems that can be covered in within a semester because the great operating systems, which are well-developed as a teaching material such as BSD and Linux, are too large to cover in a course. Also, the existing operating systems, such as Unix, were so obsolete that many students were struggled to understand the differences between old and new architectures, and between old and new programming languages. These obstacles made it difficult to teach operating system concepts with actual readable source codes. 

To attack this problem, xv6 operating system was developed as a very compacted operating system that implements important operating system concepts which are mainly inspired from Unix version 6, but it is written in ANSI C programming language and runs on x86. So, Xv6 is becoming famous for a good teaching material that is written in readable source codes, and follows current architectures, but its line of codes keeps less than 10,000 and its commentary consists of less than 100 pages, which can be covered in a semester.


B. Related works

To understand perspective of xv6, knowing related works are helpful, so this section briefly introduces two other teaching operating systems, MINIX3 and Haribote OS.

  • MINIX 3


MINIX 3, which is an operating system developed since 1987, is one of other well-known operating systems that is intended for teaching. This operating system is thoroughly explained in the textbook, Operating Systems: Design and Implementation. 

There are two key different features from xv6. One main difference is that MINIX 3 uses micro kernel structure, which is one of concepts on operating system that implementing less features in a kernel results in increasing performances rather than implementing many features in a kernel. In other words, it needs to have well abstracted organizations, although UNIX tends to put all key functions in a kernel. In fact, the kernel in MINIX 3 operating system consists of less than 4,000 lines. Thereby, MINIX 3 operating system is potentially easier to porting other architectures.

 The other key difference is the concept of MINIX 3 operating system. MINIX 3 was developed not only for teaching, but also for embedded systems. Former versions of MINIX, such as MINIX 1 and MINIX 2, were intended only for teaching, but the modern organization, micro kernel, was fit to the purpose in embedded technology fields. So, from MINIX 3, MINIX expands its features to fit to the needs from embedded systems. Thereby, MINIX 3 users are increased from MINIX 3; on the other hand, the size of MINIX 3 is becoming larger, and now the line of codes keeps more than 30,000 lines.

  • Haribote OS


The other teaching operating system is Haribote OS, which was released with the book published in 2006 in Japan. This operating system contributes to understand operating system with actual source codes, because it does not fully cover general design and concepts that other operating systems do, but is intended for learning how operating systems are implemented. So, the book starts from implementing bootstrap from disks, and ends with implementing applications running on the Haribote OS. 

Also, unlike MINIX 3 and xv6, this operating system does not need any previous knowledge. According to the author, this operating system and book was structured for being understood by not only university students, but also junior high school students. In fact, the book kindly explains computer architectures when the readers needs to understand to implement. So, this teaching operating system is friendly for operating system learners, especially at the very beginning level.


C. Overview of Xv6

Operating systems have different aspects in order to adapt a particular situation or to achieve a particular purpose. But most operating systems, especially unix and unix-like operating systems, have very similar designs. The designs should be slightly different one by one, but its essential concept is totally same.

Fundamental Concepts

Xv6 is, as it is inspired from Unix V6, designed based on such shared, essential concepts. Thereby, xv6 can be a concrete example of other general operating systems. So, understanding of xv6 can be useful for any kind of operating systems. On the other hand, because it is an aggregation of general operating systems, most organizations are so simple that it is very understandable but not effective. For example, many operating systems separately use read locks and write locks in order to decrease performance degradation. However, xv6 does not separately use locks but use just a lock, which increases readability and simplicity of source codes. But it decreases the performance.

Interfaces

From this reason, xv6 kernel provides just following system calls:


Obviously, these system calls are familiar with general operating systems, and they run in the almost same way actually. But as you see that the arguments of functions are equal to or less than general ones, each function is simplified.

Like other operating systems, these system calls serve as an interface between kernel space and user space. Processes in user space cannot use the resources held by kernel, but they can use them by calling system calls. For example, to achieve an xv6’s memory allocation organization that can be dynamic, kalloc() and kfree() system calls are used to implement.

In xv6, any kind of organizations follow this concepts. So, when seeing xv6 codes, you can easily understand essential concepts of process, scheduler, locks, traps, etc. in implementation level, but you need to consider how those concepts can be adapted to real worlds for advancing your knowledge.



2. Process in Xv6

Xv6’s process contains instructions, data, and stack in user-space memory. Also, the process information is stored at data structure in a kernel-space. For example, if a process wants to get its process identifier, it can get it by calling pid() system call. So, processes in xv6 implements several organizations by controlling the data with system calls. This section introduces fundamental functions and a few distinct organizations in xv6.

Fundamental Functions

Xv6’s process can create a new process with fork() system call. It lets a process create new child process. Each process has same memory contents but separately managed.


Also, process can stop with exit() and can wait for one of the calling process’s children to exit with wait().


As the program and sequence diagram above illustrates, fork() returns child process id to the parent process, and returns 0 to child process. So, the fork() behaves in the same way as general operating systems. However, exit() and wait() runs in a slightly different way. In general, they need status code in their arguments, but in xv6 they are not needed when calling exit() and wait(). So, when a process calls wait(), it just waits for one of the caller’s children to call exit().

Similarly, xv6’s process has exec() system call to replace the calling process’s memory with a new memory image. The different colors of data in user space illustrates, exec() system call replaces instructions, data, stack data in user space with new ones. So, in xv6,  the memory image except data structure in kernel space are all replaced with a new program when calling exec().


Scheduling

As most operating systems implement multiple process concept, xv6 implements it, too. But the way of scheduling processes works in a different, simpler way. When scheduling processes, red colored members in the following data structures are used.


The colored members are basically same as general designs except ZOMBIE state of process. In xv6, when a process died, it turns into a ZOMBIE process, and it is not removed automatically. The processes in ZOMBIE states do nothing, and wait for being detected from scheduler. If the scheduler notices that there is a ZOMBIE process in a process table, it removes from the table then. Implementing the automatically process-detected system should be smart, but it requires to have some kind of interrupts, which will reduce the simplicity of scheduler. So, ZOMBIE process state functions as just waiting for being removed.

With the colored data members, xv6’s scheduler runs in the following way which corresponds to the program below.
1. If a hardware interrupt is waiting to be handled, the scheduler's CPU will handle it before continuing. 
2. Check the process table looking for a runnable process. 
3. Set up CPU's segment descriptors and current process task state. 
4. Mark the process as RUNNING.
5. Call swtch() to start running new process.
(* swtch() function causes context switching; specifically, it saves the current context and switches to the chosen process’s context.)



These steps are basically same as the ones in general operating systems. The distinct point of xv6’s scheduler is that the scheduler seeks process table entries incrementally. It is so simple that the line of code are within just about 15 lines. However, in this scheduler, there is no way to assign process intentionally, such as priority-based scheduling. So, if there is a process that needs to be done earlier than other processes, it, however, has to wait for being assigned by xv6’s scheduler.


3. Memory Management in Xv6

As most operating systems have memory managers, xv6 operating system also has a memory manager in its kernel. If a process wants to get more memory spaces or returns its memory spaces to the kernel, they need to contact a memory manager. So, to accept a request from processes, a memory manager always manages used and free memory spaces.

In xv6, the memory manager manages its memory space with a linked list called freelist that contains list of memory area. The unit of its page size is 4KB because x86 segment size granularity is 4KB. So, for example, a page in the freelist is able to have huge size amount of memory spaces in one page, but the least size of a page should be 4KB. The actual structure is as follows:
struct run {  struct run *next;int len; // bytes;} struct run *freelist;
So, when xv6‘s user processes want to request more memory space for doing something, they can get it by calling kalloc() function. Like the scheduling algorithm in xv6, the memory manager seeks the free list incrementally, and if it detects the page that is large enough to accept the request, it returns a kernel-segment pointer. If not, it return 0. 

Similarly, if a process needed to release memory space, they can return it by calling kfree() function. But the kfree() runs a little more complex than that the kalloc() runs. When kfree() is called, not only the len bytes of memory spaces are freed, but also the memory manager sorts freelist, meaning that if there are free, adjacent pages, they are combined into one long page.

Obviously, this memory-allocating algorithm is too simple to adapt real world. But it matches xv6’s concept, an operating system as teaching material.


4. File System

As almost all of operating systems have file systems to maintain data consistently,  xv6 also has its simple file system. So, this chapter introduces how the file system in xv6 works  in terms of two sections: file system data, and file system call.

File System Data

In xv6, the file system composes of the following four elements:

1. Block Allocator 
2. I-nodes 
3. Directories 
4. Path names

First, the block allocator is responsible for managing disk blocks, and keeping track of which blocks are in use. It functions as a memory allocator does. So, if balloc() function is called, the allocator finds and returns free blocks, and then, some data is allocated at the returned new disk block. Also, it can release blocks by calling bfree(). One of the signature has a block to be freed, so if it is called, the given block will be released. So, the block allocator in xv6 can manage disk blocks.

Second element is an i-node, which refer to am unnamed file in the file system. Precisely, it is managed by several functions. For example, when allocating, getting or putting data,  the function of ialloc(), iget(), or iput() is used respectively. To allocate a new i-node, such as creating a file, xv6 will call ialloc() function. The ialloc() runs like balloc(), allocating a new i-node with the given type on device. In iget(), it finds the i-node by a given i-node number and returns its copy. And the xv6 can release a reference to an i-node by calling iput(). With such functions, xv6 can manage i-nodes.

The third is directories, which are special kind of i-nodes whose content is a sequence of directory entries. Each entry is a structure in which there are data of name and i-node number. So, for searching a directory, we can find it by calling dirlookup() with the name. Also, for writing, dirlink() function is provided. It enables us to write a new entry with the given name and i-node number into a directory. With those functions, the concept of directory is implemented in xv6.

Finally, there is a path name element, which serves as providing convenient syntax to identify particular files or directories. For example, you can directly refer to a file with the path name, “/xv6/fs.c”, by using the path name. In the case, xv6 first looks up root directory because it begins with a slash, and then, look up “xv6” directory. If the directory exits,  then starts to refer to “fs.c” file. If the path name does not begin with a slash, xv6 assumes that the path is a relative path, not absolute path.

With those elements, xv6 implements the file system. So, the xv6 file system runs in almost same way that the current unix and unix-like operating system runs. But the algorithms are totally naive. The functions to seek something such as balloc() and dirlookup(), xv6 uses a linear seeking, in order to fit the concept of xv6. So, the file system behaves almost same as other operating systems, but it is not efficient.

File System Call

We learned how the data structure in the xv6 file system works in the previous section, but not how the system calls in xv6 are related. So, this section briefly introduces which system calls are used to implement the file system. 

The following list shows some of the system calls and its brief behavior mainly used in xv6 file system.


In addition, there are some helper functions for system calls, which include argint(), argstr(), argptr(), and argfd(). They function as making a good abstraction and serve as connectors between lower level functions to upper level one. And they run in the same way as modern systems have, except that its algorithm is much simpler than the modern systems.



5. Input / Output


We described how the file system is structured and which system calls are used to access data in the previous chapter. Since this chapter, we will focus on how input and output works; specifically, how a buffer cache works in xv6.

Overview

Xv6 has a buffer cache organization to read data from and to write data into disks. This buffer cache functions as managing data temporarily before reading or writing in order to control data exclusively and increase performance. So, if two processes access the same data simultaneously, the organization can serve as serializing it. Also, it is beneficial for reducing the number of accessing physical devices, which is considered as one of the highest cost operations.

Data Structure

How does xv6 implement the buffer cache? Xv6 implements it by using a structure data called “buf”. The buf has three fields: dev, sector, and data. Dev specifies which disk device is referred, sector specifies which sector in a disk is referred, and data is a memory space for copying data from the sector. So, each buf corresponds to a sector in a disk. 

The point of this data structure is that how it collaborates with other processes for exclusive control. For collaborating, the data field in the structure data has flags to maintain a status of read and write by B_VALID, B_DIRTY, and B_BUSY. B_VALID flag determines if data is already read, B_DIRTY determines if the data need to write out, and B_BUSY is used for exclusive control. For example, if a process is using a buf, the B_BUSY flag turns on in order to avoid the situation that the other process uses the buf.

Disk Driver

To access disks, an interface between operating systems and hardwares is required. Currently, there are several kinds of interface to access disks, such as SCSI and SATA, but xv6 uses IDE controller as an interface, because it is a little older but very simple. The actual steps to boot IDE device is as follows:
1. Control interrupts:
      • Initializes locks and forbids hardware interrupts.
      • Allows uni- and multi-processor to interrupt.
2. Boot and wait:
      • Boot IDE controller
      • Do polling status bits until IDE controller becomes ready
3. Check disk status:
      • Choose each disk incrementally
      • Check if status bits are changed
      • If not changed, recognize that the disk is absent

So, through these steps above, xv6 boots the IDE and control disks.

Now we understand how to boot the disk interface, IDE controller. So, let us start understand how it controls disk accesses with buffers. In xv6, the requests of reading and writing data to disks are managed by a queue. If buffers in a queue all finish, the controller notifies it with an hardware interrupt. Specifically, the buffer is added into a queue by calling iderw(), and the first buffer in the queue is started to read or write by calling idestart(), while other buffers in the queue wait for finishing the previous buffer. 

The idestart() changes its behavior by the status of flags; if the write bit is on, it writes data, and then notify that it finished by calling ideintr(). If the read bit is on, with the sleep() system call, it waits for the notification that the disks are ready for being read, and then read data. 

Interrupts and Locks

What happens if an interrupt occurs when reading or writing data? Obviously, the consistency of disk file system will be broken because the disk access cannot be stopped immediately. To avoid such situations, xv6 has a special lock called idklock; xv6 acquires the lock by calling acquire(), and releases it by release(). So, with the lock, xv6’s writing and reading operations can be synchronized.

Like general operating systems, because the idelock is used, we should avoid race condition when acquiring and releasing the lock. So, xv6 creates critical regions by adding pushcli() or popcli() operation to forbid interrupts. Specifically, pushcli() is inserted before acquire(), and popcli() is inserted after release(). Of course, the interrupts can be nested, so the puchcli() and popcli() operations can be stacked in xv6.

Buffer Cache

Now we grasp the idea about how buffer cache works to read and write data. But why does the buffer cache serve as serializing accesses? It is obvious that we need to establish the situation that only one kernel process can edit file system’s data, because if not, the file system will not be consistent and the data will be corrupted. To allow only one process to read and write, bread() is used when using buffer caches to block processes. For example, if two processes are about to call bread(), one process gets the buffer successfully, and the other process will wait for that the first process finished its operation.

The bread() function calls two helper functions; first, call bget() to obtain a buffer for a given sector, and then, call iderw() to add the buffer into a queue in order to read or write disks. When bget() function is called, it starts to scan buffers incrementally, and if the buffer for a given sector is found, it tries to acquire a lock. If a lock is acquired successfully, bget() sets B_BUSY flag true, and return the buffer. For exceptions, if cannot acquire the lock, sleep until the lock is released. Also, if not found the targeted buffer, create a new buffer for that section, and if there is a buffer never used recently, reuse the buffer as a new buffer. In other words, there are no buffers in the list at the beginning, but each buffer is created based on needs. Once created, it is remained in the list, or reused as a new buffer.

It is common that general operating systems have the list of buffers for input and output, but in xv6, the list is structured based on LRU algorithm. The reason why xv6 uses LRU algorithm is that it is simple to scan for two meanings. If the obtained buffer is used successfully, the buffer moves its entry in the list to the first entry, because xv6 assumes that recently used buffer will be used again in the near future. Also, if looking for buffers never used recently, xv6 can scan the list by the reverse order, because the reverse order of the list means the order of buffer recently not used. So, the LRU algorithm fits to the concept of xv6.


6. Summary

In this case study, we investigated the background and concept of xv6 first, and then showed how xv6 runs. Overall, because of its concept, xv6 would be too simple as an operating system; however, it is implemented by the understandable codes and algorithms and they are essences modern operating systems. Also, the lack of functions to be an real operating system could be good assignments for learners to implement. For those reasons, xv6 is becoming famous as a new teaching operation system, and becoming widely used in operating system classes.



7. References


1. Xv6: a teaching operating system (revision 3)

2. The MINIX 3 Operating System

3. Haribote OS

ホイッスルアプリがAppleからRejectされたので反論したが、再びRejectされるまでの経緯





追記:NGな内容が入っていたので、一部修正して再投稿しました。

エンジニアとしての生き方という本を読んで、自分も思考をまとめるためにブログの記事を書いてみようと思い立ち、早速、実験的に自分の思考をまとめてみたいと思います。今回まとめたい内容は、Android Marketにおいて1週間で10,000ダウンロード、一ヶ月で35,000ダウンロードを達成したホイッスル on Androidが、AppleにはRejectされてしまったよ、という話についてです。「こんなアプリはRejectされるよ」という1つの例として参考にしてもらえれば幸いです。

3月13日、かくかくしかじかの背景があって、友人らと共にホイッスルアプリのAndroid版とiPhone版を開発し、それぞれのMarketで同時に投稿しました。Android版のアプリは、特に問題なくAndroid Marketにアップロードされ、各レビューサイトやブログなどで徐々に注目を集めていき、最終的にはDoCoMo Smart-phone Loungeに展示されたり、Mobile ASCIIに載ったりするまでに至りました。


それなりに盛況だったAndroid版ホイッスルアプリでしたが、一方で、iPhone版の方はどうなったのかというと、App Storeにアプリを投稿した約1週間後に、AppleにRejectされてしまいました。

Rejectされた理由については、指摘してもらった項目がいくつかありますが、ポイントをまとめると、こういう意味になると僕は解釈しました。

   理由1. ホイッスルアプリはあまり役に立ちそうではない.
   理由2. ホイッスルアプリの価値は長く続かない(?).
   →したがって、App Storeには載せられない。

確かに技術的にはとても簡単なので誰でも作れそうなアプリではありますが、価値が長く続かなく、役に立たないかと言われれば、「本当に?」という疑問がありました。というのも、事実として、Android版のアプリではそれなりに人気があり、今もなおダウンロード数は伸びているからです。なので、次の論拠を元に、Appleの主張した理由1.に対して反論を試みてみました(理由2.については、文脈からそこまで重要ではないと判断したため、理由1.に対して重点的に反論)。

   1.1. ホイッスルは緊急時の生存確率を上げる
          (cf. 登山時の緊急用ホイッスルの存在).
   1.2. ホイッスルは常に持ち歩かないし, 緊急時に持っていないかもしれないが,
          iPhoneは常に携帯している→いざという時に使える可能性が高い.
          →ホイッスルをiPhoneアプリにする必要性
   2. 素早く簡単に音が鳴らせること, また, 災害時は節電が
     (ネット接続を試みたり, 重い処理をしないことが)重要であること.
       →機能をシンプルにした正当性
   3. Android版では(当時)10,000ダウンロード/週だったこと,
       かつ, アプリに対して肯定的なレビューがほとんどであったこと。
       →ユーザーがusefulだと思っている証拠。
  →したがって、ホイッスルアプリは役に立つ。

これらの論拠を元にAppleに問い合わせをし、さらに待つ事2週間。Appleから来た返事は、審査委員会によって再びRejectされた、という内容でした。詳細は省きますが、僕なりに意訳すると「ホイッスルアプリに簡単にアクセス出来るようにする理由は理解出来ました。しかし、もう少し追加機能があると、より良くなると思います。」といった理由で、ホイッスルアプリの承認はやはり出来ないそうです。

ちなみに、ここでいう"追加機能"とは、フラッシュライト、緊急用アイテム一覧、避難場所案内などの機能を指しているようです。しかし、これらの機能を実装しても承認は保証できないそうです(これは、審査した人と提案した人が違うので当たり前ですね)。

少々長くなってしまいましたが、こういった経緯があってホイッスルアプリのiPhone版はApp Storeには未だありません。今後、Appleが提案したような機能を実装して再びsubmitするかもしれませんが、現段階では未定です。すみません。仮にsubmitするとしても、少なくとも「ホイッスルアプリ」という名前ではないと思います。

最後に、今回の件で得た知識を僕なりにまとめると、

   1. 論理的な思考だけでは、Appleを説得出来ないかもしれない。
   2. シンプルすぎるor技術的に簡単すぎる機能は、審査を通らないっぽい。
   3. 問い合わせの返事は、1週間〜2週間の単位で見積もりましょう。

といった感じです。何か他に気付いたことがあれば、コメント等でお知らせしてもらえると嬉しいです。

うん、結構思考がまとまったような気がする。あと、こんな記事を書いてしまうと勘違いされそうなので、少し断っておきますが、僕はApple好きのマカーです。友人に勧めるLaptopは常にMacBookで、"Besides Mac!"と友人に言われたら、"Okay, then there is nothing I can recommend."と返してしまう人間です。

2011-04-14

Titanium Studio:3日で出来るiPhoneアプリ開発環境

噂のTitanium Studioを使って、iPhoneアプリ開発経験の無い僕が3日(金、土、日曜)で出来るところまでやってみました。作った物はImage-based Word Listといって、英語初心者および初級者向けの英単語学習アプリです。



このアプリは、与えられた英単語から、その英単語に関連する画像(flickr API)と定義(Dictionary.com API)、および例文をWeb上から取って来ます(Google Search API)。英単語を学習するモード(Learn)と、復習するモード(Review)、テストするモード(Quiz)の3つがあり、これらのモードを使い分けて、英単語を学習していくことが出来ます。

似たようなコンセプトのアプリとして、LinkedWordという辞書アプリがありますが、僕の作った物は辞書ではなく単語帳アプリです。なので、APIは同じか、もしくは似ている物を使っているはずですが、表示の仕方が異なります。

具体的な動作については、DEMO動画を取ったので、それを見て下さい。バグやら何やらも写っていますが、良いところだけ写すのもどうかと思ったので、取り直しはしていません。



Titanium Studioを使ってみた感想ですが、付属のKichenSinkというサンプルアプリ集がかなり秀逸で、それを参考にして作っていけば、かなりサクサクとアプリを作っていくことができます。また、デフォルトで背景が黒色、文字が白色なのも素晴らしいです。それと、未だ試していまませんが、CUIでも動くようです。これでオープンソースというのだから、驚きです。

TitaniumではAndroidアプリも作れるようなので、今度はアンドロイドアプリをTitanium Studioで作ってみようかと思います。今のところ、ホイッスル on Androidのウィジェット版が欲しいという声をちらほら聞くので、今週末は、その実装をTitaniumでやろうかと考えてます。

Image-based Word Listのソースコード
https://github.com/yasulab/Image-based-Word-List

なお、自作API用に使っている自サーバのメンテナンスコストが大きいので、Image-based Word Listアプリを公開する予定はありません。また、自サーバが動いていないと、WebView(画面下の表示領域)に何も表示されなくなります。

// もし自作APIをGoogle App Engineなどにうまく移植出来たら、公開するかもしれません。

How to run your iPhone App on Actual Device with Titanium Studio

iPhoneアプリ開発の未経験者の僕が、Titanium Studioで実機テストをしようとして少し躓いたので、ポイントをまとめておきます。

0. 実機テストに使用するiPhoneアプリを開発。

1. 次のページを参考にProvisioning Profileを作成する。

第9回 デバイスでアプリを動かす

2. Titanium Studioを開いて、左のペインにある右矢印アイコンをクリックし、iOSデバイスを選択



3. 1.で取得したProvisioning Profileをアップロードする。

4. Finishをクリック。(クリックすると、実機へのインストールが始まります。)

ちなみに、Xcodeを開いた状態で実機へのインストールを開始するとエラーが起こります。一度エラーが起こると、たとえXcodeを閉じて再び実機へインストールしようとしても、同じ場所でエラーが起こります。このときは、キャッシュをクリアしてあげれば良いです。

具体的には、buildファイル内のファイルを全て削除すれば良いです。ここにあるファイルはbuildする度に自動生成されるので、削除しても問題ありません。



Good luck!


2011-04-12

Text Analyzer



AIの課題を解く際に出来た副作用的な作品。とりあえず公開。もしかしたらiPhoneアプリでも使えるやもしれない。

GitHub - yasulab/Text-Analyzer


/Users/yohei/text-analyzer% python text-analyzer.py
Text Analyzer
-------------
Commands:
rand NUMBER   : Randomly pick a word given times.
read FILENAME : Read a given file.
stats         : Show stats of recently read text.
help          : Show this.
exit          : Exit.

Current Text:  alice.txt

[alice.txt]> stats
Number of words: 27421

TOP 10 - MOST FRQUENTLY USED WORDS
COUNT | WORD | PROB [%]
1636 | the | 5.97
870 | and | 3.17
729 | to | 2.66
627 | a | 2.29
595 | it | 2.17
553 | she | 2.02
544 | i | 1.98
515 | of | 1.88
462 | said | 1.68
411 | you | 1.50

[alice.txt]> stata
Unknown command: stata
[alice.txt]> rands 10
Unknown command: rands
[alice.txt]> rand 10
Number of words: 27421

TOP 10 - MOST FRQUENTLY USED WORDS
COUNT | WORD | PROB [%]
1 | with | 10.00
1 | while | 10.00
1 | trotting | 10.00
1 | to | 10.00
1 | the | 10.00
1 | she | 10.00
1 | replied | 10.00
1 | isn | 10.00
1 | heard | 10.00
1 | after | 10.00

[alice.txt]> rand 2
Number of words: 27421

TOP 10 - MOST FRQUENTLY USED WORDS
COUNT | WORD | PROB [%]
1 | looked | 50.00
1 | come | 50.00

[alice.txt]> rand 100
Number of words: 27421

TOP 10 - MOST FRQUENTLY USED WORDS
COUNT | WORD | PROB [%]
9 | the | 9.00
5 | to | 5.00
4 | and | 4.00
2 | you | 2.00
2 | thought | 2.00
2 | s | 2.00
2 | must | 2.00
2 | as | 2.00
2 | all | 2.00
2 | alice | 2.00

[alice.txt]> read vicar.txt
[vicar.txt]> stats
Number of words: 64244

TOP 10 - MOST FRQUENTLY USED WORDS
COUNT | WORD | PROB [%]
2890 | the | 4.50
2093 | to | 3.26
1796 | and | 2.80
1659 | of | 2.58
1581 | i | 2.46
1317 | a | 2.05
1219 | my | 1.90
948 | in | 1.48
930 | that | 1.45
873 | was | 1.36

[vicar.txt]> help
Commands:
rand NUMBER   : Randomly pick a word given times.
read FILENAME : Read a given file.
stats         : Show stats of recently read text.
help          : Show this.
exit          : Exit.

[vicar.txt]> rand 100
Number of words: 64244

TOP 10 - MOST FRQUENTLY USED WORDS
COUNT | WORD | PROB [%]
5 | and | 5.00
3 | for | 3.00
3 | be | 3.00
2 | went | 2.00
2 | we | 2.00
2 | to | 2.00
2 | time | 2.00
2 | the | 2.00
2 | my | 2.00
2 | is | 2.00

[vicar.txt]> stats
Number of words: 64244

TOP 10 - MOST FRQUENTLY USED WORDS
COUNT | WORD | PROB [%]
2890 | the | 4.50
2093 | to | 3.26
1796 | and | 2.80
1659 | of | 2.58
1581 | i | 2.46
1317 | a | 2.05
1219 | my | 1.90
948 | in | 1.48
930 | that | 1.45
873 | was | 1.36

[vicar.txt]> help
Commands:
rand NUMBER   : Randomly pick a word given times.
read FILENAME : Read a given file.
stats         : Show stats of recently read text.
help          : Show this.
exit          : Exit.

[vicar.txt]> exit

How to Use Custom Font in Titanium Studio



ちょっとやり方が特殊だったのでメモしておきます。

Titanium Studioのカスタムフォントの使い方:

1. http://www.fontsquirrel.com/などでフォントファイルを取得&解凍する。

2. 解凍したフォントファイルを/PATH/TO/PROJECT/Resources/に置く。

3. /PATH/TO/PROJECT/build/iPhone/Info.plistをコピーして、

4. /PATH/TO/PROJECT/にコピーしたInfo.plistをペーストする。

5. /PATH/TO/PROJECT/build/iPhone/以下のすべてのファイルを削除。

# 再ビルドでまた自動生成されるので気にしない。
# キャッシュをクリアするようなものです。

6. /PATH/TO/PROJECT/Info.plistを開く。

7. Add Itemをクリックし、”UIAppFonts”と入力。

8. 生成された"UIAppFonts"を右クリック->Value Type->Arrayを選択。

9. "UIAppFonts"の直前にある矢印アイコンをクリックし(フォーカスを移し)、その後同じアイコンを右クリック。

10. Add Rowを選択。

11. Item 0 が生成されるので、そのValueにフォントファイル名(e.g. "comic_zine_ot.otf")を入力。

12. Titanium Studio内の任意のコンポーネントでカスタムフォントを使用してみる。

例:
var customFontLabel = Ti.UI.createLabel(
{
     text:"Custom Font Test",
     top: 0,
     font:{
          fontSize:60,
          fontFamily:"Comic Zine OT"
     },
     width:"auto",
     textAlign:"center"
});
Titanium.UI.currentWindow.add(customFontLabel);

13. ビルド&実行してみて、うまく表示されれば成功です。

もしどこかでつまづいたら、次のサイトの動画を参考にすればよいかと思います。

Titanium Mobile: Displaying Custom Fonts

2011-04-11

Image-based Word List on iPhone with Titanium

昨日、偶然Titaniumに出会ってしまったので、即興でちょっとしたiPhoneアプリを作ってみました。flickr APIと自作API(例文検索API)を使ってますが、まぁ、あんまり期待したほど精度は良くないですね。しかもflickrのAPIはエラーが起きる始末w。


2011-04-10

Simple Tag Getter with lxml




所用で英文テキストが欲しかったので、lxmlで英文テキストを取ってくるスクリプトを作成。汎用性はまったく無いけど、とりあえず公開。

#!/usr/bin/env python
#! -*- coding: utf-8 -*-

import sys
import urllib2
import lxml.html
        
def get_texts(url, tag):
    try:
        html = urllib2.urlopen(url).read()
    except urllib2.HTTPError, e:
        # If returned 404
        #print e.read()
        print "Can't access to the given URL."
        return
        
    root = lxml.html.fromstring(html)
    anchors = root.xpath(tag)

    text_list = []
    for a in anchors:
        text = lxml.html.tostring(a, method='text', encoding='utf-8')
        #print text.strip('\t\n')
        text_list.append(text.strip('\t\n'))
    if not text_list:
        text = "There are no tags in this page"
        text_list.append(text)
    return text_list

def create_xpath(tag, attr):
    xpath = "//"
    xpath += tag
    if not attr == "":
        xpath += "[@"
        attr,value = attr.split("=")
        xpath += attr
        xpath += "=\"" + value + "\"]"
    print "Created Xpath: " + xpath
    return xpath

if __name__ == "__main__":
    argv_len = len(sys.argv)
    if not (argv_len == 3 or argv_len == 4):
        print "Usage: python tag-getter.py URL TAG [ATTR]"
        exit()
    
    url = sys.argv[1].lower()
    tag = sys.argv[2].lower()
    if argv_len == 4:
        attr = sys.argv[3].lower()
    else:
        attr = ""
    xpath = create_xpath(tag, attr)
    text_list = get_texts(url, xpath)
    for t in text_list:
        print t
    


Usage Example:

  $ python tag-getter.py http://ebooks.adelaide.edu.au/c/carroll/lewis/alice/chapter1.html div class=dochead

Created Xpath: //div[@class="dochead"]
Alice in Wonderland, by Lewis Carroll

  $ python tag-getter.py http://ebooks.adelaide.edu.au/c/carroll/lewis/alice/chapter1.html p

Created Xpath: //p
Alice was beginning to get very tired of sitting by her sister on the bank, and of having nothing to do: once or twice
she had peeped into the book her sister was reading, but it had no pictures or conversations in it, ‘and what is the use of
a book,’ thought Alice ‘without pictures or conversation?’

... 後略 ...