The convolutional autoencoder combines the unsupervised learning of the traditional autoencoder with the convolution and pooling operations of convolutional neural networks to extract features; the resulting layers can then be stacked to form a deep network.
For details, see:

Masci J, Meier U, Cireşan D, et al. Stacked convolutional auto-encoders for hierarchical feature extraction[C]//International Conference on Artificial Neural Networks. Springer Berlin Heidelberg, 2011: 52-59.

Unsupervised learning
Unsupervised learning can learn the features of samples without any labels. As the paper above puts it:

The main purpose of unsupervised learning methods is to extract generally useful features from unlabelled data, to detect and remove input redundancies, and to preserve only essential aspects of the data in robust and discriminative representations.
The convolutional autoencoder was built precisely for this purpose: it uses the convolution and pooling operations of convolutional neural networks to perform unsupervised extraction of invariant features.

Convolutional neural networks and the traditional autoencoder
A convolutional neural network is a neural network built from convolution and pooling layers. A convolution acts as a filter, while pooling extracts invariant features. Its structure is shown in the figure below:
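To make the invariance claim concrete, here is a minimal NumPy sketch (the `max_pool2x2` helper is illustrative, not from any library): a feature shifted by one pixel within a pooling window produces the same pooled output.

```python
import numpy as np

def max_pool2x2(x):
    """2x2 max pooling with stride 2 on a 2-D array (illustrative sketch)."""
    h, w = x.shape
    return x[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

# A bright spot at (1, 1) ...
a = np.zeros((4, 4)); a[1, 1] = 1.0
# ... and the same spot shifted by one pixel to (0, 0)
b = np.zeros((4, 4)); b[0, 0] = 1.0

# Both fall inside the same 2x2 window, so pooling maps them to the same output:
print(np.array_equal(max_pool2x2(a), max_pool2x2(b)))  # True
```

This small translation invariance is exactly what makes pooled features robust.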
An autoencoder, in turn, is a neural network consisting of an input layer, a hidden layer, and an output layer, structured as shown below:
By learning the mapping between the input layer and the output layer, it reconstructs the samples and thereby extracts features.

Convolutional autoencoder
Suppose we have $k$ convolution kernels, each consisting of parameters $w^k$ and $b^k$, and let $h^k$ denote the corresponding feature map of the convolutional layer. Then

$$h^k = \sigma(x * w^k + b^k)$$
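The encoding step can be sketched in NumPy. This is a naive forward pass under assumed toy shapes (an 8×8 input, three 3×3 kernels); real libraries implement the convolution far more efficiently.

```python
import numpy as np

def conv2d_valid(x, w):
    """Naive 'valid' 2-D convolution (cross-correlation form, as in most DL libraries)."""
    H, W = x.shape
    kh, kw = w.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * w)
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 8))           # input image
kernels = rng.normal(size=(3, 3, 3))  # k = 3 kernels w^k
biases = np.zeros(3)                  # biases b^k

# h^k = sigma(x * w^k + b^k): one feature map per kernel
h = np.stack([sigmoid(conv2d_valid(x, kernels[k]) + biases[k]) for k in range(3)])
print(h.shape)  # (3, 6, 6)
```

Each kernel yields one 6×6 feature map, since a valid 3×3 convolution shrinks the 8×8 input by 2 in each dimension.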
Reconstructing the input from the feature maps $h^k$ gives:

$$y = \sigma(h^k * \hat{w}^k + c)$$
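A NumPy sketch of the decoding step, continuing the toy shapes from the encoder above. Here the per-kernel reconstructions are summed (as in the Masci et al. paper) and a 'full' convolution is used so each 6×6 map grows back to the 8×8 input size; all names and shapes are illustrative assumptions.

```python
import numpy as np

def conv2d_full(h, w):
    """'Full' 2-D convolution: pad so the output grows back to the input size."""
    kh, kw = w.shape
    hp = np.pad(h, ((kh - 1, kh - 1), (kw - 1, kw - 1)))
    out = np.empty((h.shape[0] + kh - 1, h.shape[1] + kw - 1))
    wf = w[::-1, ::-1]  # flip the kernel for true convolution
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(hp[i:i + kh, j:j + kw] * wf)
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
feature_maps = rng.normal(size=(3, 6, 6))    # h^k from the encoder
decode_kernels = rng.normal(size=(3, 3, 3))  # \hat{w}^k
c = 0.0                                      # decoder bias

# y = sigma(sum_k h^k * \hat{w}^k + c): reconstruction at the original resolution
y = sigmoid(sum(conv2d_full(feature_maps[k], decode_kernels[k]) for k in range(3)) + c)
print(y.shape)  # (8, 8)
```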
Comparing the input samples with the reconstruction by Euclidean distance and optimizing this error with backpropagation yields a complete convolutional autoencoder (CAE):

$$E = \frac{1}{2n}\sum_i (x_i - y_i)^2$$
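As a quick numeric check of the cost function, with three hypothetical input/reconstruction pairs:

```python
import numpy as np

x = np.array([0.0, 0.5, 1.0])  # original inputs x_i
y = np.array([0.1, 0.4, 0.9])  # reconstructions y_i
n = len(x)

# E = (1 / 2n) * sum_i (x_i - y_i)^2
E = np.sum((x - y) ** 2) / (2 * n)
print(round(E, 6))  # 0.005
```

Each pair differs by 0.1, so the squared errors sum to 0.03, and dividing by 2n = 6 gives 0.005.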
Code
Here is a Keras-based implementation found on GitHub (note that it uses the old Keras 1 API):
from keras.models import Model
from keras.layers import Input, Convolution2D, MaxPooling2D, UpSampling2D
from keras.callbacks import TensorBoard

def getModel():
    input_img = Input(shape=(48, 48, 1))
    x = Convolution2D(16, 3, 3, activation='relu', border_mode='same', dim_ordering='tf')(input_img)
    x = MaxPooling2D((2, 2), border_mode='same', dim_ordering='tf')(x)
    # The original applied this layer to input_img, silently discarding the first block
    x = Convolution2D(32, 3, 3, activation='relu', border_mode='same', dim_ordering='tf')(x)
    x = MaxPooling2D((2, 2), border_mode='same', dim_ordering='tf')(x)
    x = Convolution2D(64, 3, 3, activation='relu', border_mode='same', dim_ordering='tf')(x)
    encoded = MaxPooling2D((2, 2), border_mode='same', dim_ordering='tf')(x)  # 6x6x64 -- bottleneck

    x = UpSampling2D((2, 2), dim_ordering='tf')(encoded)
    x = Convolution2D(32, 3, 3, activation='relu', border_mode='same', dim_ordering='tf')(x)
    x = UpSampling2D((2, 2), dim_ordering='tf')(x)
    x = Convolution2D(16, 3, 3, activation='relu', border_mode='same', dim_ordering='tf')(x)
    # A third upsampling is needed to get back from 24x24 to the 48x48 target size
    x = UpSampling2D((2, 2), dim_ordering='tf')(x)
    decoded = Convolution2D(3, 3, 3, activation='relu', border_mode='same', dim_ordering='tf')(x)

    # Create model
    autoencoder = Model(input_img, decoded)
    return autoencoder

# Trains the model for 10 epochs
def trainModel():
    # Load dataset
    print("Loading dataset...")
    x_train_gray, x_train, x_test_gray, x_test = getDataset()

    # Create model description
    print("Creating model...")
    model = getModel()
    model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])

    # Train model
    print("Training model...")
    model.fit(x_train_gray, x_train, nb_epoch=10, batch_size=148, shuffle=True,
              validation_data=(x_test_gray, x_test),
              callbacks=[TensorBoard(log_dir='/tmp/tb', histogram_freq=0, write_graph=False)])

    # Evaluate loaded model on test data
    print("Evaluating model...")
    score = model.evaluate(x_train_gray, x_train, verbose=0)
    print("%s: %.2f%%" % (model.metrics_names[1], score[1] * 100))

    # Serialize model to JSON
    print("Saving model...")
    model_json = model.to_json()
    with open("model.json", "w") as json_file:
        json_file.write(model_json)

    # Serialize weights to HDF5
    print("Saving weights...")
    model.save_weights("model.h5")
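For reference, a sketch of the same model in the current tf.keras (Keras 2) API. Layer choices mirror the code above; the final activation is switched to sigmoid, which pairs naturally with the binary cross-entropy loss, and the `getDataset()` helper from the original is still assumed to exist elsewhere.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def get_model():
    inp = layers.Input(shape=(48, 48, 1))
    # Encoder: three conv + 2x2 max-pool blocks, 48 -> 24 -> 12 -> 6
    x = layers.Conv2D(16, (3, 3), activation='relu', padding='same')(inp)
    x = layers.MaxPooling2D((2, 2), padding='same')(x)
    x = layers.Conv2D(32, (3, 3), activation='relu', padding='same')(x)
    x = layers.MaxPooling2D((2, 2), padding='same')(x)
    x = layers.Conv2D(64, (3, 3), activation='relu', padding='same')(x)
    encoded = layers.MaxPooling2D((2, 2), padding='same')(x)  # 6x6x64 bottleneck

    # Decoder: three upsampling blocks, 6 -> 12 -> 24 -> 48
    x = layers.UpSampling2D((2, 2))(encoded)
    x = layers.Conv2D(32, (3, 3), activation='relu', padding='same')(x)
    x = layers.UpSampling2D((2, 2))(x)
    x = layers.Conv2D(16, (3, 3), activation='relu', padding='same')(x)
    x = layers.UpSampling2D((2, 2))(x)
    decoded = layers.Conv2D(3, (3, 3), activation='sigmoid', padding='same')(x)
    return Model(inp, decoded)

model = get_model()
model.compile(optimizer='rmsprop', loss='binary_crossentropy')
print(model.output_shape)  # (None, 48, 48, 3)
```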