How to install mecab-ipadic-neologd, an extended dictionary for MeCab

February 27, 2018

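Hello! This is Daiki Domon (daikidomon).

MeCab's default dictionary is dated, so in this post I install mecab-ipadic-neologd, a dictionary that is said to be strong on new words and proper nouns.

If you have not installed MeCab on Linux yet, see the article below.

How to install MeCab on Linux

This article walks through installing MeCab on Linux. Ultimately, using Python and a "Markov chain" ...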

Download mecab-ipadic-neologd

First, use git to download mecab-ipadic-neologd.
git clone https://github.com/neologd/mecab-ipadic-neologd
The execution log is below.
# git clone https://github.com/neologd/mecab-ipadic-neologd
Cloning into 'mecab-ipadic-neologd'...
remote: Counting objects: 5531, done.
remote: Compressing objects: 100% (3/3), done.
remote: Total 5531 (delta 0), reused 1 (delta 0), pack-reused 5528
Receiving objects: 100% (5531/5531), 296.54 MiB | 8.37 MiB/s, done.
Resolving deltas: 100% (3340/3340), done.

Initial setup of mecab-ipadic-neologd

Once the git download has finished, compile the dictionary data.

Note that compiling reportedly requires 1.5 GB of free memory, so check this before you start.
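As a rough way to check the memory requirement before compiling, the sketch below reads the text of /proc/meminfo and compares the available memory against the 1.5 GB figure. The function names are hypothetical, and it assumes a Linux kernel that reports MemAvailable (it falls back to MemFree otherwise).

```python
# Minimal sketch: check free memory against the 1.5 GB figure quoted above.
# The helper names are made up for this example; only /proc/meminfo is real.

REQUIRED_BYTES = int(1.5 * 1024 ** 3)  # 1.5 GB, interpreted here as GiB


def available_memory_bytes(meminfo_text):
    """Return MemAvailable (or MemFree as a fallback) in bytes."""
    values = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            # The memory entries we look up are reported in kB.
            values[key.strip()] = int(fields[0]) * 1024
    return values.get("MemAvailable", values.get("MemFree", 0))


def has_enough_memory(meminfo_text, required=REQUIRED_BYTES):
    """True if the reported available memory meets the requirement."""
    return available_memory_bytes(meminfo_text) >= required


def check_this_machine(path="/proc/meminfo"):
    """Convenience wrapper for a real Linux host."""
    with open(path) as f:
        return has_enough_memory(f.read())
```

Calling check_this_machine() on the build host returns False when less than the required amount is available, in which case freeing memory or adding swap first is probably worthwhile.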
cd mecab-ipadic-neologd
./bin/install-mecab-ipadic-neologd -n -a
The execution log is below.
# ./bin/install-mecab-ipadic-neologd -n -a
[install-mecab-ipadic-NEologd] : Start..
[install-mecab-ipadic-NEologd] : Check the existance of libraries
[install-mecab-ipadic-NEologd] : find => ok
[install-mecab-ipadic-NEologd] : sort => ok
[install-mecab-ipadic-NEologd] : head => ok
[install-mecab-ipadic-NEologd] : cut => ok
[install-mecab-ipadic-NEologd] : egrep => ok
[install-mecab-ipadic-NEologd] : mecab => ok
[install-mecab-ipadic-NEologd] : mecab-config => ok
[install-mecab-ipadic-NEologd] : make => ok
[install-mecab-ipadic-NEologd] : curl => ok
[install-mecab-ipadic-NEologd] : sed => ok
[install-mecab-ipadic-NEologd] : cat => ok
[install-mecab-ipadic-NEologd] : diff => ok
[install-mecab-ipadic-NEologd] : tar => ok
[install-mecab-ipadic-NEologd] : unxz => ok
[install-mecab-ipadic-NEologd] : xargs => ok
[install-mecab-ipadic-NEologd] : grep => ok
[install-mecab-ipadic-NEologd] : iconv => ok
which: no patch in (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin:/usr/local/python/bin)
[install-mecab-ipadic-NEologd] : patch is not found.
In my environment, the script complained that the "patch" command was not available.

So I installed "patch" and ran the script again.
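To catch a missing tool like this before starting the build, the prerequisite check can be sketched in Python. The command list is copied from the "=> ok" lines of the installer log in this article; the function name is hypothetical, and the real script may check more or fewer tools in other versions.

```python
import shutil

# Command names copied from the installer's "=> ok" log lines in this article.
REQUIRED_COMMANDS = [
    "find", "sort", "head", "cut", "egrep", "mecab", "mecab-config",
    "make", "curl", "sed", "cat", "diff", "tar", "unxz", "xargs",
    "grep", "iconv", "patch", "which", "file", "openssl", "awk",
]


def missing_commands(names=REQUIRED_COMMANDS, which=shutil.which):
    """Return the commands from `names` that cannot be found on PATH.

    `which` is injectable so the lookup can be replaced in tests.
    """
    return [name for name in names if which(name) is None]


if __name__ == "__main__":
    missing = missing_commands()
    if missing:
        print("not found:", ", ".join(missing))
    else:
        print("all required commands are available")
```

Any name it reports can then be installed with the distribution's package manager before running the install script.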
# ./bin/install-mecab-ipadic-neologd -n -a
[install-mecab-ipadic-NEologd] : Start..
[install-mecab-ipadic-NEologd] : Check the existance of libraries
[install-mecab-ipadic-NEologd] : find => ok
[install-mecab-ipadic-NEologd] : sort => ok
[install-mecab-ipadic-NEologd] : head => ok
[install-mecab-ipadic-NEologd] : cut => ok
[install-mecab-ipadic-NEologd] : egrep => ok
[install-mecab-ipadic-NEologd] : mecab => ok
[install-mecab-ipadic-NEologd] : mecab-config => ok
[install-mecab-ipadic-NEologd] : make => ok
[install-mecab-ipadic-NEologd] : curl => ok
[install-mecab-ipadic-NEologd] : sed => ok
[install-mecab-ipadic-NEologd] : cat => ok
[install-mecab-ipadic-NEologd] : diff => ok
[install-mecab-ipadic-NEologd] : tar => ok
[install-mecab-ipadic-NEologd] : unxz => ok
[install-mecab-ipadic-NEologd] : xargs => ok
[install-mecab-ipadic-NEologd] : grep => ok
[install-mecab-ipadic-NEologd] : iconv => ok
[install-mecab-ipadic-NEologd] : patch => ok
[install-mecab-ipadic-NEologd] : which => ok
[install-mecab-ipadic-NEologd] : file => ok
[install-mecab-ipadic-NEologd] : openssl => ok
[install-mecab-ipadic-NEologd] : awk => ok

[install-mecab-ipadic-NEologd] : mecab-ipadic-NEologd is already up-to-date

[install-mecab-ipadic-NEologd] : mecab-ipadic-NEologd will be install to /usr/local/lib/mecab/dic/mecab-ipadic-neologd

[install-mecab-ipadic-NEologd] : Make mecab-ipadic-NEologd
[make-mecab-ipadic-NEologd] : Start..
[make-mecab-ipadic-NEologd] : Check local seed directory
[make-mecab-ipadic-NEologd] : Check local seed file
[make-mecab-ipadic-NEologd] : Check local build directory
[make-mecab-ipadic-NEologd] : create /usr/local/src/mecab-ipadic-neologd/libexec/../build
[make-mecab-ipadic-NEologd] : Download original mecab-ipadic file
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 383 0 383 0 0 160 0 --:--:-- 0:00:02 --:--:-- 160
0 0 0 11.6M 0 0 2566k 0 --:--:-- 0:00:04 --:--:-- 8053k
[make-mecab-ipadic-NEologd] : Decompress original mecab-ipadic file
mecab-ipadic-2.7.0-20070801/
mecab-ipadic-2.7.0-20070801/README
mecab-ipadic-2.7.0-20070801/AUTHORS
mecab-ipadic-2.7.0-20070801/COPYING
mecab-ipadic-2.7.0-20070801/ChangeLog
mecab-ipadic-2.7.0-20070801/INSTALL
mecab-ipadic-2.7.0-20070801/Makefile.am
mecab-ipadic-2.7.0-20070801/Makefile.in
mecab-ipadic-2.7.0-20070801/NEWS
mecab-ipadic-2.7.0-20070801/aclocal.m4
mecab-ipadic-2.7.0-20070801/config.guess
mecab-ipadic-2.7.0-20070801/config.sub
mecab-ipadic-2.7.0-20070801/configure
mecab-ipadic-2.7.0-20070801/configure.in
mecab-ipadic-2.7.0-20070801/install-sh
mecab-ipadic-2.7.0-20070801/missing
mecab-ipadic-2.7.0-20070801/mkinstalldirs
mecab-ipadic-2.7.0-20070801/Adj.csv
mecab-ipadic-2.7.0-20070801/Adnominal.csv
mecab-ipadic-2.7.0-20070801/Adverb.csv
mecab-ipadic-2.7.0-20070801/Auxil.csv
mecab-ipadic-2.7.0-20070801/Conjunction.csv
mecab-ipadic-2.7.0-20070801/Filler.csv
mecab-ipadic-2.7.0-20070801/Interjection.csv
mecab-ipadic-2.7.0-20070801/Noun.adjv.csv
mecab-ipadic-2.7.0-20070801/Noun.adverbal.csv
mecab-ipadic-2.7.0-20070801/Noun.csv
mecab-ipadic-2.7.0-20070801/Noun.demonst.csv
mecab-ipadic-2.7.0-20070801/Noun.nai.csv
mecab-ipadic-2.7.0-20070801/Noun.name.csv
mecab-ipadic-2.7.0-20070801/Noun.number.csv
mecab-ipadic-2.7.0-20070801/Noun.org.csv
mecab-ipadic-2.7.0-20070801/Noun.others.csv
mecab-ipadic-2.7.0-20070801/Noun.place.csv
mecab-ipadic-2.7.0-20070801/Noun.proper.csv
mecab-ipadic-2.7.0-20070801/Noun.verbal.csv
mecab-ipadic-2.7.0-20070801/Others.csv
mecab-ipadic-2.7.0-20070801/Postp-col.csv
mecab-ipadic-2.7.0-20070801/Postp.csv
mecab-ipadic-2.7.0-20070801/Prefix.csv
mecab-ipadic-2.7.0-20070801/Suffix.csv
mecab-ipadic-2.7.0-20070801/Symbol.csv
mecab-ipadic-2.7.0-20070801/Verb.csv
mecab-ipadic-2.7.0-20070801/char.def
mecab-ipadic-2.7.0-20070801/feature.def
mecab-ipadic-2.7.0-20070801/left-id.def
mecab-ipadic-2.7.0-20070801/matrix.def
mecab-ipadic-2.7.0-20070801/pos-id.def
mecab-ipadic-2.7.0-20070801/rewrite.def
mecab-ipadic-2.7.0-20070801/right-id.def
mecab-ipadic-2.7.0-20070801/unk.def
mecab-ipadic-2.7.0-20070801/dicrc
mecab-ipadic-2.7.0-20070801/RESULT
[make-mecab-ipadic-NEologd] : Configure custom system dictionary on /usr/local/src/mecab-ipadic-neologd/libexec/../build/mecab-ipadic-2.7.0-20070801-neologd-20180219
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking whether make sets $(MAKE)... yes
checking for working aclocal-1.4... missing
checking for working autoconf... missing
checking for working automake-1.4... missing
checking for working autoheader... missing
checking for working makeinfo... missing
checking for a BSD-compatible install... /usr/bin/install -c
checking for mecab-config... /usr/local/bin/mecab-config
configure: creating ./config.status
config.status: creating Makefile
[make-mecab-ipadic-NEologd] : Encode the character encoding of system dictionary resources from EUC_JP to UTF-8
./../../libexec/iconv_euc_to_utf8.sh ./Adj.csv
./../../libexec/iconv_euc_to_utf8.sh ./Adnominal.csv
./../../libexec/iconv_euc_to_utf8.sh ./Adverb.csv
./../../libexec/iconv_euc_to_utf8.sh ./Auxil.csv
./../../libexec/iconv_euc_to_utf8.sh ./Conjunction.csv
./../../libexec/iconv_euc_to_utf8.sh ./Filler.csv
./../../libexec/iconv_euc_to_utf8.sh ./Interjection.csv
./../../libexec/iconv_euc_to_utf8.sh ./Noun.adjv.csv
./../../libexec/iconv_euc_to_utf8.sh ./Noun.adverbal.csv
./../../libexec/iconv_euc_to_utf8.sh ./Noun.csv
./../../libexec/iconv_euc_to_utf8.sh ./Noun.demonst.csv
./../../libexec/iconv_euc_to_utf8.sh ./Noun.nai.csv
./../../libexec/iconv_euc_to_utf8.sh ./Noun.name.csv
./../../libexec/iconv_euc_to_utf8.sh ./Noun.number.csv
./../../libexec/iconv_euc_to_utf8.sh ./Noun.org.csv
./../../libexec/iconv_euc_to_utf8.sh ./Noun.others.csv
./../../libexec/iconv_euc_to_utf8.sh ./Noun.place.csv
./../../libexec/iconv_euc_to_utf8.sh ./Noun.proper.csv
./../../libexec/iconv_euc_to_utf8.sh ./Noun.verbal.csv
./../../libexec/iconv_euc_to_utf8.sh ./Others.csv
./../../libexec/iconv_euc_to_utf8.sh ./Postp-col.csv
./../../libexec/iconv_euc_to_utf8.sh ./Postp.csv
./../../libexec/iconv_euc_to_utf8.sh ./Prefix.csv
./../../libexec/iconv_euc_to_utf8.sh ./Suffix.csv
./../../libexec/iconv_euc_to_utf8.sh ./Symbol.csv
./../../libexec/iconv_euc_to_utf8.sh ./Verb.csv
rm ./Adj.csv
rm ./Adnominal.csv
rm ./Adverb.csv
rm ./Auxil.csv
rm ./Conjunction.csv
rm ./Filler.csv
rm ./Interjection.csv
rm ./Noun.adjv.csv
rm ./Noun.adverbal.csv
rm ./Noun.csv
rm ./Noun.demonst.csv
rm ./Noun.nai.csv
rm ./Noun.name.csv
rm ./Noun.number.csv
rm ./Noun.org.csv
rm ./Noun.others.csv
rm ./Noun.place.csv
rm ./Noun.proper.csv
rm ./Noun.verbal.csv
rm ./Others.csv
rm ./Postp-col.csv
rm ./Postp.csv
rm ./Prefix.csv
rm ./Suffix.csv
rm ./Symbol.csv
rm ./Verb.csv
./../../libexec/iconv_euc_to_utf8.sh ./char.def
./../../libexec/iconv_euc_to_utf8.sh ./feature.def
./../../libexec/iconv_euc_to_utf8.sh ./left-id.def
./../../libexec/iconv_euc_to_utf8.sh ./matrix.def
./../../libexec/iconv_euc_to_utf8.sh ./pos-id.def
./../../libexec/iconv_euc_to_utf8.sh ./rewrite.def
./../../libexec/iconv_euc_to_utf8.sh ./right-id.def
./../../libexec/iconv_euc_to_utf8.sh ./unk.def
rm ./char.def
rm ./feature.def
rm ./left-id.def
rm ./matrix.def
rm ./pos-id.def
rm ./rewrite.def
rm ./right-id.def
rm ./unk.def
mv ./Adj.csv.utf8 ./Adj.csv
mv ./Adnominal.csv.utf8 ./Adnominal.csv
mv ./Adverb.csv.utf8 ./Adverb.csv
mv ./Auxil.csv.utf8 ./Auxil.csv
mv ./Conjunction.csv.utf8 ./Conjunction.csv
mv ./Filler.csv.utf8 ./Filler.csv
mv ./Interjection.csv.utf8 ./Interjection.csv
mv ./Noun.adjv.csv.utf8 ./Noun.adjv.csv
mv ./Noun.adverbal.csv.utf8 ./Noun.adverbal.csv
mv ./Noun.csv.utf8 ./Noun.csv
mv ./Noun.demonst.csv.utf8 ./Noun.demonst.csv
mv ./Noun.nai.csv.utf8 ./Noun.nai.csv
mv ./Noun.name.csv.utf8 ./Noun.name.csv
mv ./Noun.number.csv.utf8 ./Noun.number.csv
mv ./Noun.org.csv.utf8 ./Noun.org.csv
mv ./Noun.others.csv.utf8 ./Noun.others.csv
mv ./Noun.place.csv.utf8 ./Noun.place.csv
mv ./Noun.proper.csv.utf8 ./Noun.proper.csv
mv ./Noun.verbal.csv.utf8 ./Noun.verbal.csv
mv ./Others.csv.utf8 ./Others.csv
mv ./Postp-col.csv.utf8 ./Postp-col.csv
mv ./Postp.csv.utf8 ./Postp.csv
mv ./Prefix.csv.utf8 ./Prefix.csv
mv ./Suffix.csv.utf8 ./Suffix.csv
mv ./Symbol.csv.utf8 ./Symbol.csv
mv ./Verb.csv.utf8 ./Verb.csv
mv ./char.def.utf8 ./char.def
mv ./feature.def.utf8 ./feature.def
mv ./left-id.def.utf8 ./left-id.def
mv ./matrix.def.utf8 ./matrix.def
mv ./pos-id.def.utf8 ./pos-id.def
mv ./rewrite.def.utf8 ./rewrite.def
mv ./right-id.def.utf8 ./right-id.def
mv ./unk.def.utf8 ./unk.def
[make-mecab-ipadic-NEologd] : Fix yomigana field of IPA dictionary
patching file Noun.csv
patching file Noun.place.csv
patching file Verb.csv
patching file Noun.verbal.csv
patching file Noun.name.csv
patching file Noun.adverbal.csv
patching file Noun.csv
patching file Noun.name.csv
patching file Noun.org.csv
patching file Noun.others.csv
patching file Noun.place.csv
patching file Noun.proper.csv
patching file Noun.verbal.csv
patching file Prefix.csv
patching file Suffix.csv
patching file Noun.proper.csv
patching file Noun.csv
patching file Noun.name.csv
patching file Noun.org.csv
patching file Noun.place.csv
patching file Noun.proper.csv
patching file Noun.verbal.csv
patching file Noun.name.csv
patching file Noun.org.csv
patching file Noun.place.csv
patching file Noun.proper.csv
patching file Suffix.csv
patching file Noun.demonst.csv
patching file Noun.csv
patching file Noun.name.csv
[make-mecab-ipadic-NEologd] : Copy user dictionary resource
[make-mecab-ipadic-NEologd] : Install adverb entries using /usr/local/src/mecab-ipadic-neologd/libexec/../seed/neologd-adverb-dict-seed.20150623.csv.xz
[make-mecab-ipadic-NEologd] : Install interjection entries using /usr/local/src/mecab-ipadic-neologd/libexec/../seed/neologd-interjection-dict-seed.20170216.csv.xz
[make-mecab-ipadic-NEologd] : Install noun orthographic variant entries using /usr/local/src/mecab-ipadic-neologd/libexec/../seed/neologd-common-noun-ortho-variant-dict-seed.20170228.csv.xz
[make-mecab-ipadic-NEologd] : Install noun orthographic variant entries using /usr/local/src/mecab-ipadic-neologd/libexec/../seed/neologd-proper-noun-ortho-variant-dict-seed.20161110.csv.xz
[make-mecab-ipadic-NEologd] : Install entries of orthographic variant of a noun used as verb form using /usr/local/src/mecab-ipadic-neologd/libexec/../seed/neologd-noun-sahen-conn-ortho-variant-dict-seed.20160323.csv.xz
[make-mecab-ipadic-NEologd] : Install frequent adjective orthographic variant entries using /usr/local/src/mecab-ipadic-neologd/libexec/../seed/neologd-adjective-std-dict-seed.20151126.csv.xz
[make-mecab-ipadic-NEologd] : Install infrequent adjective orthographic variant entries using /usr/local/src/mecab-ipadic-neologd/libexec/../seed/neologd-adjective-exp-dict-seed.20151126.csv.xz
[make-mecab-ipadic-NEologd] : Install adjective verb orthographic variant entries using /usr/local/src/mecab-ipadic-neologd/libexec/../seed/neologd-adjective-verb-dict-seed.20160324.csv.xz
[make-mecab-ipadic-NEologd] : Install infrequent datetime representation entries using /usr/local/src/mecab-ipadic-neologd/libexec/../seed/neologd-date-time-infreq-dict-seed.20170224.csv.xz
[make-mecab-ipadic-NEologd] : Install infrequent quantity representation entries using /usr/local/src/mecab-ipadic-neologd/libexec/../seed/neologd-quantity-infreq-dict-seed.20170224.csv.xz
[make-mecab-ipadic-NEologd] : Install entries of ill formed words using /usr/local/src/mecab-ipadic-neologd/libexec/../seed/neologd-ill-formed-words-dict-seed.20170127.csv.xz
[make-mecab-ipadic-NEologd] : Re-Index system dictionary
reading ./unk.def ... 40
emitting double-array: 100% |###########################################|
./model.def is not found. skipped.
reading ./Adj.csv ... 27210
reading ./Adnominal.csv ... 135
reading ./Adverb.csv ... 3032
reading ./Auxil.csv ... 199
reading ./Conjunction.csv ... 171
reading ./Filler.csv ... 19
reading ./Interjection.csv ... 252
reading ./Noun.adverbal.csv ... 808
reading ./Noun.demonst.csv ... 120
reading ./Noun.number.csv ... 42
reading ./Noun.others.csv ... 153
reading ./Noun.proper.csv ... 27493
reading ./neologd-interjection-dict-seed.20170216.csv ... 4701
reading ./neologd-adjective-exp-dict-seed.20151126.csv ... 1051146
reading ./neologd-adjective-verb-dict-seed.20160324.csv ... 20268
reading ./Noun.verbal.csv ... 12150
reading ./Others.csv ... 2
reading ./Postp-col.csv ... 91
reading ./Postp.csv ... 146
reading ./Prefix.csv ... 224
reading ./Suffix.csv ... 1448
reading ./Symbol.csv ... 208
reading ./Verb.csv ... 130750
reading ./mecab-user-dict-seed.20180219.csv ... 3103805
reading ./neologd-proper-noun-ortho-variant-dict-seed.20161110.csv ... 138379
reading ./neologd-noun-sahen-conn-ortho-variant-dict-seed.20160323.csv ... 26058
reading ./neologd-date-time-infreq-dict-seed.20170224.csv ... 16533
reading ./neologd-ill-formed-words-dict-seed.20170127.csv ... 60616
reading ./Noun.adjv.csv ... 3328
reading ./Noun.csv ... 60734
reading ./Noun.nai.csv ... 42
reading ./Noun.name.csv ... 34215
reading ./Noun.org.csv ... 17149
reading ./Noun.place.csv ... 73194
reading ./neologd-adverb-dict-seed.20150623.csv ... 139792
reading ./neologd-common-noun-ortho-variant-dict-seed.20170228.csv ... 152869
reading ./neologd-adjective-std-dict-seed.20151126.csv ... 507812
reading ./neologd-quantity-infreq-dict-seed.20170224.csv ... 205926
emitting double-array: 100% |###########################################|
reading ./matrix.def ... 1316x1316
emitting matrix : 100% |###########################################|

done!
[make-mecab-ipadic-NEologd] : Make custom system dictionary on /usr/local/src/mecab-ipadic-neologd/libexec/../build/mecab-ipadic-2.7.0-20070801-neologd-20180219
make: `all' に対して行うべき事はありません.
[make-mecab-ipadic-NEologd] : Finish..
[install-mecab-ipadic-NEologd] : Get results of tokenize test
[test-mecab-ipadic-NEologd] : Start..
[test-mecab-ipadic-NEologd] : Replace timestamp from 'git clone' date to 'git commit' date
[test-mecab-ipadic-NEologd] : Get buzz phrases
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 107k 0 107k 0 0 106k 0 --:--:-- 0:00:01 --:--:-- 106k
[test-mecab-ipadic-NEologd] : Get difference between default system dictionary and mecab-ipadic-NEologd
[test-mecab-ipadic-NEologd] : Tokenize phrase using default system dictionary
[test-mecab-ipadic-NEologd] : Tokenize phrase using mecab-ipadic-NEologd
[test-mecab-ipadic-NEologd] : Get result of diff
[test-mecab-ipadic-NEologd] : Please check difference between default system dictionary and mecab-ipadic-NEologd

default system dictionary | mecab-ipadic-NEologd
緊急 地震 速報 | 緊急地震速報
地震 情報 | 地震情報
マカオバンジー | マカオ バンジー
け や かけ | けやかけ
南海 トラフ | 南海トラフ
強震 モニタ | 強震モニタ
乃木坂 工事 中 | 乃木坂工事中
睡眠 障害 | 睡眠障害
さ ゆ にゃ ん | さゆ にゃん
馬 龍 | 馬龍
三 ツ 星 カラーズ | 三ツ星カラーズ
ハウステンボス 歌劇 学院 | ハウステンボス歌劇学院
森岡 亮太 | 森岡亮太

[test-mecab-ipadic-NEologd] : Finish..

[install-mecab-ipadic-NEologd] : Please check the list of differences in the upper part.

[install-mecab-ipadic-NEologd] : Do you want to install mecab-ipadic-NEologd? Type yes or no.
yes
[install-mecab-ipadic-NEologd] : OK. Let's install mecab-ipadic-NEologd.
[install-mecab-ipadic-NEologd] : Start..
[install-mecab-ipadic-NEologd] : /usr/local/lib/mecab/dic is current user's directory
[install-mecab-ipadic-NEologd] : Make install to /usr/local/lib/mecab/dic/mecab-ipadic-neologd
make[1]: ディレクトリ `/usr/local/src/mecab-ipadic-neologd/build/mecab-ipadic-2.7.0-20070801-neologd-20180219' に入ります
make[1]: `install-exec-am' に対して行うべき事はありません.
/bin/sh ./mkinstalldirs /usr/local/lib/mecab/dic/mecab-ipadic-neologd
mkdir /usr/local/lib/mecab/dic/mecab-ipadic-neologd
/usr/bin/install -c -m 644 ./matrix.bin /usr/local/lib/mecab/dic/mecab-ipadic-neologd/matrix.bin
/usr/bin/install -c -m 644 ./char.bin /usr/local/lib/mecab/dic/mecab-ipadic-neologd/char.bin
/usr/bin/install -c -m 644 ./sys.dic /usr/local/lib/mecab/dic/mecab-ipadic-neologd/sys.dic
/usr/bin/install -c -m 644 ./unk.dic /usr/local/lib/mecab/dic/mecab-ipadic-neologd/unk.dic
/usr/bin/install -c -m 644 ./left-id.def /usr/local/lib/mecab/dic/mecab-ipadic-neologd/left-id.def
/usr/bin/install -c -m 644 ./right-id.def /usr/local/lib/mecab/dic/mecab-ipadic-neologd/right-id.def
/usr/bin/install -c -m 644 ./rewrite.def /usr/local/lib/mecab/dic/mecab-ipadic-neologd/rewrite.def
/usr/bin/install -c -m 644 ./pos-id.def /usr/local/lib/mecab/dic/mecab-ipadic-neologd/pos-id.def
/usr/bin/install -c -m 644 ./dicrc /usr/local/lib/mecab/dic/mecab-ipadic-neologd/dicrc
make[1]: ディレクトリ `/usr/local/src/mecab-ipadic-neologd/build/mecab-ipadic-2.7.0-20070801-neologd-20180219' から出ます

[install-mecab-ipadic-NEologd] : Install completed.
[install-mecab-ipadic-NEologd] : When you use MeCab, you can set '/usr/local/lib/mecab/dic/mecab-ipadic-neologd' as a value of '-d' option of MeCab.
[install-mecab-ipadic-NEologd] : Usage of mecab-ipadic-NEologd is here.
Usage:
$ mecab -d /usr/local/lib/mecab/dic/mecab-ipadic-neologd ...

[install-mecab-ipadic-NEologd] : Finish..
[install-mecab-ipadic-NEologd] : Finish..

Trying out mecab-ipadic-neologd

The command to run it is shown below. The key point is to use the "-d" option to specify where the dictionary is stored.
mecab -d /usr/local/lib/mecab/dic/mecab-ipadic-neologd/
Below is the log from an actual run.
echo "10日放送の「中居正広のミになる図書館」(テレビ朝日系)で、SMAPの中居正広が、篠原信一の過去の勘違いを明かす一幕があった。" | mecab -d /usr/local/lib/mecab/dic/mecab-ipadic-neologd
10日 名詞,固有名詞,一般,*,*,*,10日,トオカ,トオカ
放送 名詞,サ変接続,*,*,*,*,放送,ホウソウ,ホーソー
の 助詞,連体化,*,*,*,*,の,ノ,ノ
「 記号,括弧開,*,*,*,*,「,「,「
中居正広のミになる図書館 名詞,固有名詞,一般,*,*,*,中居正広の身になる図書館,ナカイマサヒロノミニナルトショカン,ナカイマサヒロノミニナルトショカン
」 記号,括弧閉,*,*,*,*,」,」,」
( 記号,括弧開,*,*,*,*,(,(,(
テレビ朝日 名詞,固有名詞,組織,*,*,*,テレビ朝日,テレビアサヒ,テレビアサヒ
系 名詞,接尾,一般,*,*,*,系,ケイ,ケイ
) 記号,括弧閉,*,*,*,*,),),)
で 助詞,格助詞,一般,*,*,*,で,デ,デ
、 記号,読点,*,*,*,*,、,、,、
SMAP 名詞,固有名詞,一般,*,*,*,SMAP,スマップ,スマップ
の 助詞,連体化,*,*,*,*,の,ノ,ノ
中居正広 名詞,固有名詞,人名,一般,*,*,中居正広,ナカイマサヒロ,ナカイマサヒロ
が 助詞,格助詞,一般,*,*,*,が,ガ,ガ
、 記号,読点,*,*,*,*,、,、,、
篠原信一 名詞,固有名詞,人名,一般,*,*,篠原信一,シノハラシンイチ,シノハラシンイチ
の 助詞,連体化,*,*,*,*,の,ノ,ノ
過去 名詞,副詞可能,*,*,*,*,過去,カコ,カコ
の 助詞,連体化,*,*,*,*,の,ノ,ノ
勘違い 名詞,サ変接続,*,*,*,*,勘違い,カンチガイ,カンチガイ
を 助詞,格助詞,一般,*,*,*,を,ヲ,ヲ
明かす 動詞,自立,*,*,五段・サ行,基本形,明かす,アカス,アカス
一幕 名詞,一般,*,*,*,*,一幕,ヒトマク,ヒトマク
が 助詞,格助詞,一般,*,*,*,が,ガ,ガ
あっ 動詞,自立,*,*,五段・ラ行,連用タ接続,ある,アッ,アッ
た 助動詞,*,*,*,特殊・タ,基本形,た,タ,タ
。 記号,句点,*,*,*,*,。,。,。
EOS
This confirms that "中居正広のミになる図書館" is recognized as a single new word.
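The same run can be driven from Python, the language mentioned in the linked MeCab article. The sketch below calls the mecab command with "-d" and splits the default output into tokens. It assumes mecab is on PATH, that the dictionary sits at the path printed by the installer, and that the default output format (surface, a tab, then comma-separated features) is in use; the function names are hypothetical.

```python
import subprocess

# Install path reported by the installer log in this article.
NEOLOGD_DIR = "/usr/local/lib/mecab/dic/mecab-ipadic-neologd"


def run_mecab(text, dicdir=NEOLOGD_DIR):
    """Run the mecab CLI with -d and return its raw output (needs mecab on PATH)."""
    result = subprocess.run(
        ["mecab", "-d", dicdir],
        input=text,
        capture_output=True,
        encoding="utf-8",
        check=True,
    )
    return result.stdout


def parse_mecab_output(output):
    """Split default-format output into (surface, [features]) pairs, skipping EOS."""
    tokens = []
    for line in output.splitlines():
        if not line or line == "EOS":
            continue
        surface, _, feature_str = line.partition("\t")
        tokens.append((surface, feature_str.split(",")))
    return tokens


def proper_nouns(tokens):
    """Return surfaces tagged 名詞,固有名詞 (noun, proper noun)."""
    return [
        surface
        for surface, features in tokens
        if len(features) > 1 and features[0] == "名詞" and features[1] == "固有名詞"
    ]
```

For example, proper_nouns(parse_mecab_output(run_mecab(sentence))) lists the proper nouns in a sentence, which is a quick way to see which new words the dictionary picks up as single tokens.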

Recommended settings for mecab-ipadic-neologd

I recommend using cron to keep the dictionary upgraded at all times. The cron definition is below.
00 03 * * 2 ./bin/install-mecab-ipadic-neologd -n -y -u -p /path/to/user/directory > /path/to/log/file
00 03 * * 5 ./bin/install-mecab-ipadic-neologd -n -y -u -p /path/to/user/directory > /path/to/log/file
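In this example, the dictionary data is updated to the latest version at 3 a.m. every Tuesday and Friday. Note that cron typically starts jobs in the user's home directory, so the relative path ./bin/install-mecab-ipadic-neologd only resolves if the job first changes into the cloned repository; using an absolute path is the safer choice.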
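To double-check how the schedule fields are read, here is a small sketch that decodes the first five fields of a cron entry. It is a hypothetical helper, not a full cron parser: it only handles plain numeric minute, hour and day-of-week fields like the ones used here.

```python
# Day-of-week numbering used by cron: 0 (and 7) is Sunday.
DAY_NAMES = [
    "Sunday", "Monday", "Tuesday", "Wednesday",
    "Thursday", "Friday", "Saturday",
]


def describe_cron_time(entry):
    """Return (weekday name, hour, minute) for a simple numeric cron entry."""
    minute, hour, _day_of_month, _month, day_of_week = entry.split()[:5]
    return DAY_NAMES[int(day_of_week) % 7], int(hour), int(minute)


if __name__ == "__main__":
    for line in (
        "00 03 * * 2 ./bin/install-mecab-ipadic-neologd -n -y -u",
        "00 03 * * 5 ./bin/install-mecab-ipadic-neologd -n -y -u",
    ):
        print(describe_cron_time(line))
```

The two entries decode to Tuesday and Friday at 03:00, matching the description above.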

Where to find the manual

Reference: https://github.com/neologd/mecab-ipadic-neologd/blob/master/README.ja.md

 

Copyright© スタートアップIT企業社長のブログ , 2020 All Rights Reserved.