java.lang.Object
net.java.sen.dictionary.Tokenizer
public abstract class Tokenizer
A String Tokenizer.
The Tokenizer uses a Dictionary to assist the decomposition of
strings into potential morphemes.
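Because the class is abstract, a concrete tokenizer has to supply lookup. The following is only a minimal sketch of that contract, not Sen's implementation: Node, Dictionary, SentenceIterator and the Tokenizer base are hypothetical stand-ins that mirror just the signatures listed on this page, EmptyTokenizer is an invented name, and returning null to mean "no candidates" is an assumption. The body of getDictionary is also assumed, since this page gives it no description.

```java
import java.io.IOException;

// Hypothetical stand-ins: only the names and signatures come from this page.
class Node { }
class Dictionary { }
interface SentenceIterator { }

// Reduced stand-in for the abstract base class (node factory methods omitted).
abstract class Tokenizer {
    protected Dictionary dictionary;
    protected String unknownPartOfSpeechDescription;

    public Tokenizer(Dictionary dictionary, String unknownPartOfSpeechDescription) {
        this.dictionary = dictionary;
        this.unknownPartOfSpeechDescription = unknownPartOfSpeechDescription;
    }

    // Assumption: returns the Dictionary passed to the constructor.
    public Dictionary getDictionary() {
        return dictionary;
    }

    public abstract Node lookup(SentenceIterator iterator, char[] surface)
            throws IOException;
}

// Invented subclass that never reports a candidate morpheme.
class EmptyTokenizer extends Tokenizer {
    public EmptyTokenizer(Dictionary dictionary, String unknownPartOfSpeechDescription) {
        super(dictionary, unknownPartOfSpeechDescription);
    }

    @Override
    public Node lookup(SentenceIterator iterator, char[] surface) throws IOException {
        return null; // assumption: null stands for "no candidate morphemes"
    }
}
```

A real subclass would consult the dictionary field inside lookup and return the matching Nodes.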
| Field Summary | |
|---|---|
| protected Node | bosNode: A Node representing a beginning-of-string |
| protected Dictionary | dictionary: The Dictionary used to find possible morphemes |
| protected Node | eosNode: A Node representing an end-of-string |
| protected CToken | unknownCToken: A CToken representing an unknown morpheme |
| protected java.lang.String | unknownPartOfSpeechDescription: The part-of-speech code to use for unknown tokens |
| Constructor Summary | |
|---|---|
| Tokenizer(Dictionary dictionary, java.lang.String unknownPartOfSpeechDescription) | Constructs a new Tokenizer that uses the specified Dictionary to find possible morphemes within a given string |
| Method Summary | |
|---|---|
| Node | getBOSNode(): Creates a unique beginning-of-string Node. |
| Dictionary | getDictionary() |
| Node | getEOSNode(): Creates a unique end-of-string Node. |
| Node | getUnknownNode(char[] surface, int start, int length, int span): Creates an "unknown morpheme" Node with the specified characteristics. |
| abstract Node | lookup(SentenceIterator iterator, char[] surface): Searches for possible morphemes from the given SentenceIterator. |
| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
protected Dictionary dictionary
The Dictionary used to find possible morphemes
protected CToken unknownCToken
A CToken representing an unknown morpheme
protected Node bosNode
A Node representing a beginning-of-string
protected Node eosNode
A Node representing an end-of-string
protected java.lang.String unknownPartOfSpeechDescription
The part-of-speech code to use for unknown tokens
| Constructor Detail |
|---|
public Tokenizer(Dictionary dictionary,
java.lang.String unknownPartOfSpeechDescription)
Constructs a new Tokenizer that uses the specified
Dictionary to find possible morphemes within a given string
Parameters:
dictionary - The Dictionary to search within
unknownPartOfSpeechDescription - The part-of-speech code to use for unknown tokens
| Method Detail |
|---|
public Dictionary getDictionary()
public Node getBOSNode()
Creates a unique beginning-of-string Node. The Node
returned by this method is freshly cloned and not an alias of any
other Node
Returns: the beginning-of-string Node
public Node getEOSNode()
Creates a unique end-of-string Node. The Node returned by
this method is freshly cloned and not an alias of any other Node
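To make "freshly cloned and not an alias" concrete, here is a small sketch of the pattern. It assumes a hypothetical stand-in Node with a single illustrative field (start) and an invented NodeSource holder. The real Sen classes are not reproduced, and how Sen clones its nodes internally is not stated on this page.

```java
// Hypothetical stand-in for Sen's Node; the real class carries more state.
class Node implements Cloneable {
    int start; // illustrative field (assumption)

    @Override
    public Node clone() {
        try {
            return (Node) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen: Node is Cloneable
        }
    }
}

// Invented holder mimicking the bosNode field and the getBOSNode() accessor.
class NodeSource {
    protected final Node bosNode = new Node();

    public Node getBOSNode() {
        return bosNode.clone(); // a fresh copy on every call
    }
}

class CloneDemo {
    public static void main(String[] args) {
        NodeSource source = new NodeSource();
        Node a = source.getBOSNode();
        Node b = source.getBOSNode();
        a.start = 5;                 // mutate one copy only
        System.out.println(a != b);  // prints true: distinct objects
        System.out.println(b.start); // prints 0: the other copy is unaffected
    }
}
```

Under this pattern a caller can modify a returned Node freely without disturbing the prototype or any Node handed out earlier.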
public Node getUnknownNode(char[] surface,
int start,
int length,
int span)
Creates an "unknown morpheme" Node with the specified
characteristics. The Node returned by this method is freshly
cloned and not an alias of any other Node
Parameters:
surface - The underlying surface of which the Node is part
start - The index of the first character of the surface within the Node
length - The length of the Node
span - The span of the Node
Returns: the "unknown morpheme" Node
public abstract Node lookup(SentenceIterator iterator,
char[] surface)
throws java.io.IOException
Searches for possible morphemes from the given SentenceIterator. The
Node that is returned links through
Node.rnext to a list of matches which may be of varying
lengths
Parameters:
iterator - The iterator to search from
surface - The underlying character surface
Returns: Nodes representing the possible
morphemes beginning at the given index
Throws:
java.io.IOException
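A caller would typically walk the returned chain through rnext. The sketch below illustrates that traversal under stated assumptions: Node is a hypothetical stand-in, only the rnext link is named on this page, the length field is assumed for illustration, and the chain is assumed to end with a null rnext.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Sen's Node. Only rnext is named on this page;
// the length field is an assumption made for illustration.
class Node {
    Node rnext;
    int length;

    Node(int length, Node rnext) {
        this.length = length;
        this.rnext = rnext;
    }
}

class RnextWalk {
    // Collects the length of every candidate reachable through rnext.
    static List<Integer> candidateLengths(Node head) {
        List<Integer> lengths = new ArrayList<>();
        for (Node node = head; node != null; node = node.rnext) {
            lengths.add(node.length);
        }
        return lengths;
    }

    public static void main(String[] args) {
        // Three candidate matches of differing lengths, linked through rnext.
        Node head = new Node(1, new Node(2, new Node(4, null)));
        System.out.println(candidateLengths(head)); // prints [1, 2, 4]
    }
}
```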