# Artistic Painter Using VGG 19 Neural Network


# VGG 19 Model and Artistic Style Painter

Goal: In this project, I use the 19-layer VGG convolutional neural network (VGG-19) to learn the artistic style of a painting and transfer it to a target photo. Details of the technique can be found in the paper A Neural Algorithm of Artistic Style.

The VGG-19 model weights can be downloaded from the VGG group website. They are provided as a Caffe model and also as a MATLAB model for use with MatConvNet, which makes it possible to reconstruct VGG-19 in Python and, of course, TensorFlow. I will construct VGG-19 in TensorFlow and use it for both content reconstruction and style painting. The training was performed on a Lenovo W530 ThinkPad laptop with an Nvidia K1000M mobile graphics card.

Starting from an image of random noise, this deep-learning painter merges the content of a photo with the style learned from a Claude Monet painting to create something wonderful. The final image is a Monet-style painting based on a real-life photo.

In [1]:
# TensorFlow CNN Tutorial
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
import scipy.io
import scipy.misc
import PIL
from PIL import ImageOps, Image
import time
import sys
from six.moves import urllib

# Use CPU only, since the GPU is occupied by CIFAR10
import os
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = ""    # Change to '0' to use tf.device("/gpu:0")

%matplotlib inline

In [2]:
style_source = 'water_lily_monet.jpg'
simg = PIL.Image.open(style_source)
plt.imshow(simg)
plt.title('Water Lilies - Claude Monet')
plt.axis('off')
plt.show()
simg = np.asarray(simg, np.float32)
print "Original image size: ", simg.shape

Original image size:  (1076, 781, 3)

In [3]:
# Shrink the images to speed up processing
HEIGHT = 400
WIDTH = 300

# Define a function to resize an image
def resize_img(org_img_file, height, width):
    oimg = Image.open(org_img_file)
    # ImageOps.fit crops to the target aspect ratio, then resizes
    rz_img = ImageOps.fit(oimg, (width, height), Image.ANTIALIAS)
    rz_img_np = np.asarray(rz_img, np.float32)
    return rz_img_np

s_img = resize_img(style_source, HEIGHT, WIDTH)
plt.title('Water Lilies - Claude Monet, Resized')
plt.axis('off')
a_simg = s_img.astype(np.uint8)
plt.imshow(a_simg)
plt.show()
print "Resized art image size : ", s_img.shape

Resized art image size :  (400, 300, 3)
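
`ImageOps.fit` crops the image to the target aspect ratio before resizing, so the picture is not stretched. The sketch below reproduces only the arithmetic of a centered crop; `center_crop_box` is a hypothetical helper written for illustration, and Pillow's exact rounding may differ.

```python
def center_crop_box(src_w, src_h, dst_w, dst_h):
    """Return (left, top, right, bottom) of the largest centered box
    in the source image that has the destination aspect ratio."""
    src_ratio = float(src_w) / src_h
    dst_ratio = float(dst_w) / dst_h
    if src_ratio > dst_ratio:
        # Source is too wide: keep the full height, trim the sides
        crop_w, crop_h = src_h * dst_ratio, float(src_h)
    else:
        # Source is too tall: keep the full width, trim top and bottom
        crop_w, crop_h = float(src_w), src_w / dst_ratio
    left = (src_w - crop_w) / 2.0
    top = (src_h - crop_h) / 2.0
    return left, top, left + crop_w, top + crop_h

# The Monet image is 781 pixels wide and 1076 tall; the target is 300 by 400
box = center_crop_box(781, 1076, 300, 400)
```

For this image the full width is kept and a thin strip is trimmed from the top and bottom before the crop is scaled down to 300 by 400.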

In [4]:
target_source = 'water_lily_photo.jpg'
t_img = resize_img(target_source, HEIGHT, WIDTH)
plt.title('Water Lily Photo, Resized')
plt.axis('off')
a_t_img = t_img.astype(np.uint8)
plt.imshow(a_t_img)
plt.show()
print "Resized photo size : ", t_img.shape

Resized photo size :  (400, 300, 3)

In [5]:
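# Download VGG-19 model
vgg_model_url = 'http://www.vlfeat.org/matconvnet/models/imagenet-vgg-verydeep-19.mat'
#vgg_model_file = 'imagenet-vgg-verydeep-19.mat'
expected_bytes = 534904783

# To Do: add auto retry with delay if network connection error found
# Assumed: the function name, signature, progress message and error message
# below are a reconstruction; only the body fragments appear in this listing.
def download_file(url_link, dest_directory, filename, expected_bytes):
    """
    Args:
        url_link: URL of the file to download
        dest_directory: local directory to store the file in
        filename: local file name
        expected_bytes: expected file size, used to verify the download
    Return:
        filepath
    """
    if not os.path.exists(dest_directory):
        os.makedirs(dest_directory)
    filepath = os.path.join(dest_directory, filename)

    if not os.path.exists(filepath):
        def _progress(count, block_size, total_size):
            sys.stdout.write('\r>> Downloading %s %.1f%%' %
                             (filename, float(count * block_size) / float(total_size) * 100.0))
            sys.stdout.flush()
        filepath, _ = urllib.request.urlretrieve(url_link, filepath, _progress)
        print '\n'

    statinfo = os.stat(filepath)
    if statinfo.st_size == expected_bytes:
        print 'Successfully downloaded', filename, statinfo.st_size, 'bytes.'
    else:
        raise Exception('File ' + filename + ' has an unexpected size; try downloading it again.')
    return filepath

# Assumed call: store the model in the current directory
vgg_model_file = download_file(vgg_model_url, '.', 'imagenet-vgg-verydeep-19.mat', expected_bytes)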


Successfully downloaded imagenet-vgg-verydeep-19.mat 534904783 bytes.
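
The To Do note in the download cell asks for automatic retry with a delay. One possible shape for that is sketched below. `retry` is a hypothetical helper, not part of the notebook, and the choice of `IOError` as the retryable error is an assumption (network errors raised by `urllib` derive from it).

```python
import time

def retry(func, attempts=3, delay=1.0, backoff=2.0,
          exceptions=(IOError,), sleep=time.sleep):
    """Call func() until it succeeds or the attempts run out.

    After each failure, wait `delay` seconds and multiply the delay by
    `backoff`. The last failure is re-raised to the caller.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions:
            if attempt == attempts:
                raise
            sleep(delay)
            delay *= backoff

# Hypothetical usage with the download step:
# retry(lambda: urllib.request.urlretrieve(url_link, filepath), attempts=5, delay=2.0)
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting.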

In [6]:
# Load the mat into python
# scipy.io load into a dictionary
# Path assumed: the file name used in the download step
mat_dict = scipy.io.loadmat('imagenet-vgg-verydeep-19.mat')
print "VGG mat first level structure:"
for key in mat_dict.keys():
    print "\t", key
layers = mat_dict['layers']
print "VGG-19 model version:", mat_dict['__version__']
print "VGG-19 parameter matrix shape:", layers.shape

VGG mat first level structure:
layers
__version__
meta
__globals__
VGG-19 model version: 1.0
VGG-19 parameter matrix shape: (1, 43)

In [7]:
# Play with the matrix to explore its structure
# print mat_dict['__globals__'] # Empty
# print mat_dict['meta']  # List of classes, averageImage, etc.
# print layers  # too much information, need to peel the onion
# print layers[0][2]  # this level selects layers
# print layers[0][42] # last layer, softmax layer
# print layers[0][2][0][0][0]  # here is the layer name, such as 'conv1_2'
# print layers[0][3][0][0][1] # this is for layer type, such as 'relu'
# print layers[0][2][0][0][2]  # Found layer parameters, finally!
# Double-check that this matches what we are looking for, using the first layer
# The first layer takes a 3-channel image and processes it with (3x3) convolutional kernels
layer_name = layers[0][0][0][0][0]
print "Layer name:", layer_name
w = layers[0][0][0][0][2][0][0]
print "Weight shape:", w.shape # using (3x3) kernel, to process 3 channels of input image and there are 64 such kernels
b = layers[0][0][0][0][2][0][1]
print "Bias shape:", b.shape # 64 bias parameters for 64 kernels or 64 hidden units

Layer name: [u'conv1_1']
Weight shape: (3, 3, 3, 64)
Bias shape: (64, 1)

In [8]:
# Prepare images for VGG
# Subtract the mean image of the VGG-19 model
# Also prepare the VGG-19 model parameter matrix
vgg_mean = mat_dict['meta'][0][0][-1][0][0][2][0][0]
vgg_mean = np.array(vgg_mean)
vgg_mean = np.reshape(vgg_mean, (1,1,3)) # reshape so the mean is subtracted per channel by broadcasting

t_img = t_img - vgg_mean
s_img = s_img - vgg_mean

t_img = np.expand_dims(t_img, 0)
s_img = np.expand_dims(s_img, 0)
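
The preprocessing above can be checked in isolation with NumPy. The sketch below is illustrative: `preprocess` and `deprocess` are hypothetical helper names, and the mean values are the ones given as the `vgg_mean_pixel` default of `VggModel`, assumed here to match the mean stored in the `.mat` file.

```python
import numpy as np

# Per-channel RGB mean, shaped (1, 1, 3) so it broadcasts over height and width
VGG_MEAN = np.array([123.68, 116.779, 103.939], dtype=np.float32).reshape(1, 1, 3)

def preprocess(img):
    """(H, W, 3) float image -> (1, H, W, 3) mean-centered batch."""
    return np.expand_dims(img - VGG_MEAN, 0)

def deprocess(batch):
    """(1, H, W, 3) mean-centered batch -> (H, W, 3) uint8 image for display."""
    img = batch[0] + VGG_MEAN
    return np.clip(img, 0, 255).astype(np.uint8)

# Random stand-in for a resized 400x300 photo
img = np.random.uniform(0, 255, (400, 300, 3)).astype(np.float32)
batch = preprocess(img)
restored = deprocess(batch)
```

The inverse step matters later: the optimized image lives in mean-centered space, so the mean has to be added back and the values clipped before the result can be saved or shown.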

In [9]:
# Extract VGG Deep Neural Network Layers
vgg_param_map = {}
vgg_layer_list = []
for l in xrange(layers[0].shape[0]):
    layer_name = layers[0][l][0][0][0][0].decode('utf-8')
    layer_type = layers[0][l][0][0][1][0].decode('utf-8')
    param = layers[0][l][0][0][2]
    vgg_param_map[layer_name] = param
    vgg_layer_list.append(layer_name)
    print "Layer: %s, type: %s, shape: %s" % (layer_name, layer_type, param.shape)

Layer: conv1_1, type: conv, shape: (1, 2)
Layer: relu1_1, type: relu, shape: (1, 1)
Layer: conv1_2, type: conv, shape: (1, 2)
Layer: relu1_2, type: relu, shape: (1, 1)
Layer: pool1, type: pool, shape: (1,)
Layer: conv2_1, type: conv, shape: (1, 2)
Layer: relu2_1, type: relu, shape: (1, 1)
Layer: conv2_2, type: conv, shape: (1, 2)
Layer: relu2_2, type: relu, shape: (1, 1)
Layer: pool2, type: pool, shape: (1,)
Layer: conv3_1, type: conv, shape: (1, 2)
Layer: relu3_1, type: relu, shape: (1, 1)
Layer: conv3_2, type: conv, shape: (1, 2)
Layer: relu3_2, type: relu, shape: (1, 1)
Layer: conv3_3, type: conv, shape: (1, 2)
Layer: relu3_3, type: relu, shape: (1, 1)
Layer: conv3_4, type: conv, shape: (1, 2)
Layer: relu3_4, type: relu, shape: (1, 1)
Layer: pool3, type: pool, shape: (1,)
Layer: conv4_1, type: conv, shape: (1, 2)
Layer: relu4_1, type: relu, shape: (1, 1)
Layer: conv4_2, type: conv, shape: (1, 2)
Layer: relu4_2, type: relu, shape: (1, 1)
Layer: conv4_3, type: conv, shape: (1, 2)
Layer: relu4_3, type: relu, shape: (1, 1)
Layer: conv4_4, type: conv, shape: (1, 2)
Layer: relu4_4, type: relu, shape: (1, 1)
Layer: pool4, type: pool, shape: (1,)
Layer: conv5_1, type: conv, shape: (1, 2)
Layer: relu5_1, type: relu, shape: (1, 1)
Layer: conv5_2, type: conv, shape: (1, 2)
Layer: relu5_2, type: relu, shape: (1, 1)
Layer: conv5_3, type: conv, shape: (1, 2)
Layer: relu5_3, type: relu, shape: (1, 1)
Layer: conv5_4, type: conv, shape: (1, 2)
Layer: relu5_4, type: relu, shape: (1, 1)
Layer: pool5, type: pool, shape: (1,)
Layer: fc6, type: conv, shape: (1, 2)
Layer: relu6, type: relu, shape: (1, 1)
Layer: fc7, type: conv, shape: (1, 2)
Layer: relu7, type: relu, shape: (1, 1)
Layer: fc8, type: conv, shape: (1, 2)
Layer: prob, type: softmax, shape: (0, 0)
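
The listing shows five convolutional blocks separated by `pool1` to `pool5`. To get a feel for the tensor sizes involved, the sketch below walks a 400x300 input through those blocks. It assumes stride-1 convolutions that preserve spatial size and 2x2, stride-2 pooling that rounds odd sizes up ('SAME' padding); `vgg19_feature_shapes` is a hypothetical helper, not part of the notebook.

```python
import math

def vgg19_feature_shapes(height, width):
    """Return (block_name, height, width, channels) for the five conv blocks,
    assuming size-preserving convolutions and halving (rounded up) at each pool."""
    channels = [64, 128, 256, 512, 512]   # output channels of VGG-19 blocks 1..5
    shapes = []
    h, w = height, width
    for i, c in enumerate(channels):
        shapes.append(('conv%d' % (i + 1), h, w, c))
        # 2x2 pooling with stride 2 halves each spatial dimension
        h = int(math.ceil(h / 2.0))
        w = int(math.ceil(w / 2.0))
    return shapes

shapes = vgg19_feature_shapes(400, 300)
for name, h, w, c in shapes:
    print('%s: %d x %d x %d' % (name, h, w, c))
```

Under these assumptions the `conv4` block, where the content layer `conv4_2` lives, works on 50 x 38 feature maps with 512 channels.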

In [10]:
# Define utility for generating white noise image
def generate_white_noise_image(height, width, std=30):
    white_noise_image = np.random.uniform(-std, std, (1, height, width, 3)).astype(np.float32)
    return white_noise_image

# Define utilities for constructing VGG CNN graph
def get_cnn_param(vgg_param_map, layer_name):
    """
    Inputs:
    - vgg_param_map: dict for VGG layer parameters
    - layer_name: (string) layer name, must match with the list of VGG layers
    Outputs:
    - Tuple of (W, b) of CNN layer
    """

    W = vgg_param_map[layer_name][0][0]
    b = vgg_param_map[layer_name][0][1]
    b = b.reshape(b.size)  # flatten (n, 1) bias to (n,)
    return W, b

def conv2d_layer(vgg_param_map, input, layer_name):
    """
    Inputs:
    - vgg_param_map: dict for VGG layer parameters
    - input: input data from previous layer
    - layer_name: (string) VGG layer name
    Output:
    - output data after convolution plus bias (ReLU is applied separately by relu_layer)
    """

    # Since in this project we are not going to change these parameters
    # We will use constants
    with tf.variable_scope(layer_name):
        W, b = get_cnn_param(vgg_param_map, layer_name)
        W = tf.constant(W, name=layer_name + '_weights')
        b = tf.constant(b, name=layer_name + '_bias')
        cnn = tf.nn.conv2d(input,
                           filter=W,
                           strides=[1,1,1,1],
                           padding='SAME')  # assumed: 'SAME' keeps the spatial size, as in VGG
        output = cnn + b
    return output

def relu_layer(input, layer_name):
    """
    Apply relu nonlinearity to input
    Inputs:
    - input: input data
    - layer_name: name of layer
    Return:
    - output: after applying Relu
    """
    with tf.variable_scope(layer_name):
        output = tf.nn.relu(input)
    return output

def affine_layer(vgg_param_map, input, layer_name):
    """
    Extract VGG-19 parameter based on layer name, construct fully connected layer
    VGG-19 uses CNN to implement affine layer
    """
    with tf.variable_scope(layer_name):
        W, b = get_cnn_param(vgg_param_map, layer_name)
        W = tf.constant(W, name=layer_name + '_weights')
        b = tf.constant(b, name=layer_name + '_bias')
        cnn = tf.nn.conv2d(input,
                           filter=W,
                           strides=[1,1,1,1],
                           padding='VALID')  # assumed: 'VALID' lets the kernel cover its whole input window
        output = cnn + b
        #output = tf.matmul(input, W) + b
    return output

def pool_layer(input, layer_name, maxpool=False):
    """
    Pooling layer
    Instead of max_pool, the paper recommends using average pooling

    Inputs:
    - input : input data from previous CNN layer
    - layer_name: VGG-19 layer name
    - maxpool: use max pooling instead of average pooling
    Returns:
    - pool_op: pooling results of 2x2 window per test case, per channel
    """
    pool_op = None
    if maxpool:
        pool_op = tf.nn.max_pool(input,
                                 ksize=[1,2,2,1],
                                 strides=[1,2,2,1],
                                 padding='SAME',  # assumed padding mode
                                 name='max_pool_layer' + layer_name)
    else:
        pool_op = tf.nn.avg_pool(input,
                                 ksize=[1,2,2,1],
                                 strides=[1,2,2,1],
                                 padding='SAME',  # assumed padding mode
                                 name='avg_pool_layer' + layer_name)
    return pool_op
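
`affine_layer` relies on the fact that a fully connected layer is the same computation as a convolution whose kernel covers the entire input. The NumPy sketch below checks that equivalence on small stand-in tensors. `conv2d_valid` is a naive hypothetical helper written for this check, not the TensorFlow op, and the 7x7 window mirrors the usual shape of the `fc6` kernel in the standard 224x224 setup.

```python
import numpy as np

def conv2d_valid(x, W):
    """Naive stride-1 convolution without padding (cross-correlation).
    x: (H, W, C_in), W: (kh, kw, C_in, C_out) -> (H-kh+1, W-kw+1, C_out)."""
    H, Wd, _ = x.shape
    kh, kw, _, c_out = W.shape
    out = np.zeros((H - kh + 1, Wd - kw + 1, c_out))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = x[i:i + kh, j:j + kw, :]
            # Sum over kernel height, kernel width and input channels
            out[i, j] = np.tensordot(patch, W, axes=([0, 1, 2], [0, 1, 2]))
    return out

rng = np.random.RandomState(0)
x = rng.randn(7, 7, 8)        # small stand-in for a 7x7x512 pool5 output
W = rng.randn(7, 7, 8, 16)    # small stand-in for a 7x7x512x4096 fc6 kernel
b = rng.randn(16)

conv_out = conv2d_valid(x, W)[0, 0] + b              # single output position
fc_out = x.reshape(-1).dot(W.reshape(-1, 16)) + b    # flatten, then matrix multiply
```

Both paths give the same 16 numbers, which is why the stored convolution kernels can be used directly without reshaping them into a matrix.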

In [11]:
# Define a class for VGG model
class VggModel(object):
    """
    Build neural network layers for VGG-19
    """

    def __init__(self, vgg_param_dict, vgg_layer_list,
                 img_height,
                 img_width,
                 learning_rate=1e-2,
                 content_layer='conv4_2',
                 style_layers_config={'conv1_1':0.2,
                                      'conv2_1':0.2,
                                      'conv3_1':0.2,
                                      'conv4_1':0.2,
                                      'conv5_1':0.2},
                 vgg_mean_pixel=[123.68, 116.779, 103.939],
                 content_weight=0.01,
                 verbose=False
                 ):

        ...

    def _create_placeholders(self):
        ...

    def _content_loss(self, p, f):
        ...

    def _gram_matrix(self, F, N, M):
        ...

    def _one_style_loss(self, a, g):
        ...

    def _style_loss(self, A):
        ...

    def _total_loss(self, content_image, style_image):
        ...

    def build_vgg_graph(self):
        ...

    def _create_summary(self):
        ...

    def train(self, content_image, style_image):
        ...
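
The method bodies are elided in this listing. As a rough guide to what the loss methods compute, the NumPy sketch below follows the formulas in A Neural Algorithm of Artistic Style: half the squared feature difference for content, and the squared difference of Gram matrices, scaled by 1 / (4 N^2 M^2), for each style layer. The function names mirror the method names, but this is an illustration under that assumption, not the notebook's TensorFlow code.

```python
import numpy as np

def content_loss(p, f):
    """Half the squared error between content features p and generated features f."""
    return 0.5 * np.sum((f - p) ** 2)

def gram_matrix(F, N, M):
    """F: feature map with N channels and M = height * width positions.
    Returns the (N, N) matrix of channel-to-channel correlations."""
    F = F.reshape(M, N)
    return F.T.dot(F)

def one_style_loss(a, g):
    """a, g: (1, H, W, N) feature maps of the style and generated images."""
    N = a.shape[3]
    M = a.shape[1] * a.shape[2]
    A = gram_matrix(a, N, M)
    G = gram_matrix(g, N, M)
    return np.sum((G - A) ** 2) / (4.0 * N ** 2 * M ** 2)

rng = np.random.RandomState(0)
style_feat = rng.randn(1, 5, 4, 3)   # toy feature maps: 5x4 positions, 3 channels
gen_feat = rng.randn(1, 5, 4, 3)
print(one_style_loss(style_feat, gen_feat))
```

Because the Gram matrix sums over all spatial positions, it records which channels fire together but not where, which is what lets it capture texture and brushwork without copying the painting's layout. The per-layer losses would then be combined with weights such as those in `style_layers_config` and added to the weighted content loss.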


In [12]:
# Test VGG graph creation
model = VggModel(vgg_param_map, vgg_layer_list,
                 img_height=HEIGHT, img_width=WIDTH,
                 content_weight=0.01,
                 learning_rate=1e2)
model.train(t_img, s_img)

Step 301
Sum: 38453098.8
Loss: 397448128.0
Time: 36.7496809959
Step 306
Sum: 38292527.3
Loss: 395066944.0
Time: 20.4376380444
Step 311
Sum: 38134227.2
Loss: 392785984.0
Time: 20.5191249847
Step 316
Sum: 37978276.3
Loss: 390609600.0
Time: 20.7634079456
Step 321
Sum: 37824888.5
Loss: 388543008.0
Time: 20.7791938782
Step 326
Sum: 37673826.5
Loss: 386593632.0
Time: 22.163271904
Step 331
Sum: 37524825.1
Loss: 386233984.0
Time: 20.9584538937
Step 336
Sum: 37377706.7
Loss: 384119008.0
Time: 20.9911539555
Step 341
Sum: 37231463.5
Loss: 382698176.0
Time: 26.2235519886
Step 346
Sum: 37086969.6
Loss: 382249760.0
Time: 21.2615199089

In [14]:
# Plot trained images, from random noise to painting style
steps = [0, 30, 100, 200, 300, 345]
panels = [('Style', a_simg), ('Content', a_t_img)]
panels += [('Step %d' % s, np.array(Image.open('outputs/%d.png' % s))) for s in steps]
for i, (title, img) in enumerate(panels):
    plt.subplot(2, 4, i + 1)
    plt.title(title)
    plt.axis('off')
    plt.imshow(img)
plt.show()