Conference Program

A Study of Image and Text Representation Methods in Evaluation Models for Cooking Recipes

○野中芽依¹⁾、大竹恒平²⁾、生田目崇¹⁾

1) Chuo University
2) Tokai University

Abstract: When searching for a cooking recipe, users mainly refer to two types of information: photos and text. Prior work has attempted to evaluate each menu item by combining these two types of information, including a multimodal coupled network that considers image and text features simultaneously. However, it has not been examined whether the size of the feature vectors extracted from the images and text is sufficient to represent the information. In this study, to assess which feature vectors can represent the information, we create multimodal feature vectors while varying the dimension of the feature vectors, and we verify how the accuracy of the multimodal coupled network changes with that dimension.
Keywords: Multimodal Coupled Network, Cooking Recipes, Deep Learning
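
The idea of building multimodal feature vectors at several dimensions can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not taken from the study: the random placeholder features, the sizes 512 and 300, the candidate dimensions, the helper names `pca_reduce` and `build_multimodal`, and the use of PCA as the way to change the dimension. The abstract does not state how the dimension is varied or how the two modalities are combined.

```python
import numpy as np


def pca_reduce(features: np.ndarray, dim: int) -> np.ndarray:
    """Project row-wise feature vectors onto their top `dim` principal components."""
    centered = features - features.mean(axis=0, keepdims=True)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:dim].T


def build_multimodal(image_feats, text_feats, image_dim, text_dim):
    """Concatenate dimension-reduced image and text features for each recipe."""
    return np.concatenate(
        [pca_reduce(image_feats, image_dim), pca_reduce(text_feats, text_dim)],
        axis=1,
    )


# Hypothetical placeholder data: one row per recipe.
rng = np.random.default_rng(0)
n_recipes = 200
image_feats = rng.normal(size=(n_recipes, 512))  # assumed image feature size
text_feats = rng.normal(size=(n_recipes, 300))   # assumed text feature size

# Assumed candidate dimensions; each yields one multimodal feature matrix.
candidate_dims = [16, 64, 128]
fused_by_dim = {
    d: build_multimodal(image_feats, text_feats, d, d) for d in candidate_dims
}

for d, fused in fused_by_dim.items():
    print(d, fused.shape)
```

Each matrix in `fused_by_dim` would then be passed to the same downstream model, so that differences in accuracy can be attributed to the feature dimension alone.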