add SRVGGNetCompact arch, update inference

This commit is contained in:
Xintao
2021-12-12 13:29:21 +08:00
parent 3e0085aeda
commit 696e1a6741
7 changed files with 139 additions and 62 deletions

4
FAQ.md
View File

@@ -1,9 +1,5 @@
# FAQ # FAQ
1. **What is the difference of `--netscale` and `outscale`?**
A: TODO.
1. **How to select models?** 1. **How to select models?**
A: TODO. A: TODO.

View File

@@ -166,7 +166,7 @@ wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.0/RealESRGAN_
Inference! Inference!
```bash ```bash
python inference_realesrgan.py --model_path experiments/pretrained_models/RealESRGAN_x4plus.pth --input inputs --face_enhance python inference_realesrgan.py -n RealESRGAN_x4plus -i inputs --face_enhance
``` ```
Results are in the `results` folder Results are in the `results` folder
@@ -184,7 +184,7 @@ Pre-trained models: [RealESRGAN_x4plus_anime_6B](https://github.com/xinntao/Real
# download model # download model
wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/RealESRGAN_x4plus_anime_6B.pth -P experiments/pretrained_models wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/RealESRGAN_x4plus_anime_6B.pth -P experiments/pretrained_models
# inference # inference
python inference_realesrgan.py --model_path experiments/pretrained_models/RealESRGAN_x4plus_anime_6B.pth --input inputs python inference_realesrgan.py -n RealESRGAN_x4plus_anime_6B -i inputs
``` ```
Results are in the `results` folder Results are in the `results` folder
@@ -194,23 +194,23 @@ Results are in the `results` folder
1. You can use X4 model for **arbitrary output size** with the argument `outscale`. The program will further perform cheap resize operation after the Real-ESRGAN output. 1. You can use X4 model for **arbitrary output size** with the argument `outscale`. The program will further perform cheap resize operation after the Real-ESRGAN output.
```console ```console
Usage: python inference_realesrgan.py --model_path experiments/pretrained_models/RealESRGAN_x4plus.pth --input infile --output outfile [options]... Usage: python inference_realesrgan.py -n RealESRGAN_x4plus -i infile -o outfile [options]...
A common command: python inference_realesrgan.py --model_path experiments/pretrained_models/RealESRGAN_x4plus.pth --input infile --netscale 4 --outscale 3.5 --half --face_enhance A common command: python inference_realesrgan.py -n RealESRGAN_x4plus -i infile --outscale 3.5 --half --face_enhance
-h show this help -h show this help
--input Input image or folder. Default: inputs -i --input Input image or folder. Default: inputs
--output Output folder. Default: results -o --output Output folder. Default: results
--model_path Path to the pre-trained model. Default: experiments/pretrained_models/RealESRGAN_x4plus.pth -n --model_name Model name. Default: RealESRGAN_x4plus
--netscale Upsample scale factor of the network. Default: 4 -s, --outscale The final upsampling scale of the image. Default: 4
--outscale The final upsampling scale of the image. Default: 4
--suffix Suffix of the restored image. Default: out --suffix Suffix of the restored image. Default: out
--tile Tile size, 0 for no tile during testing. Default: 0 -t, --tile Tile size, 0 for no tile during testing. Default: 0
--face_enhance Whether to use GFPGAN to enhance face. Default: False --face_enhance Whether to use GFPGAN to enhance face. Default: False
--half Whether to use half precision during inference. Default: False --half Whether to use half precision during inference. Default: False
--ext Image extension. Options: auto | jpg | png, auto means using the same extension as inputs. Default: auto --ext Image extension. Options: auto | jpg | png, auto means using the same extension as inputs. Default: auto
``` ```
## :european_castle: Model Zoo ## :european_castle: Model Zoo
- [RealESRGAN_x4plus](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.0/RealESRGAN_x4plus.pth): X4 model for general images - [RealESRGAN_x4plus](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.0/RealESRGAN_x4plus.pth): X4 model for general images

View File

@@ -162,7 +162,7 @@ wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.0/RealESRGAN_
推断! 推断!
```bash ```bash
python inference_realesrgan.py --model_path experiments/pretrained_models/RealESRGAN_x4plus.pth --input inputs --face_enhance python inference_realesrgan.py -n RealESRGAN_x4plus -i inputs --face_enhance
``` ```
结果在`results`文件夹 结果在`results`文件夹
@@ -180,28 +180,27 @@ python inference_realesrgan.py --model_path experiments/pretrained_models/RealES
# 下载模型 # 下载模型
wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/RealESRGAN_x4plus_anime_6B.pth -P experiments/pretrained_models wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/RealESRGAN_x4plus_anime_6B.pth -P experiments/pretrained_models
# 推断 # 推断
python inference_realesrgan.py --model_path experiments/pretrained_models/RealESRGAN_x4plus_anime_6B.pth --input inputs python inference_realesrgan.py -n RealESRGAN_x4plus_anime_6B -i inputs
``` ```
结果在`results`文件夹 结果在`results`文件夹
### Python 脚本的用法 ### Python 脚本的用法
1. 虽然你用了 X4 模型,但是你可以 **输出任意尺寸比例的图片**,只要实用了 `outscale` 参数. 程序会进一步对模型的输出图像进行缩放。 1. 虽然你使用了 X4 模型,但是你可以 **输出任意尺寸比例的图片**,只要实用了 `outscale` 参数. 程序会进一步对模型的输出图像进行缩放。
```console ```console
Usage: python inference_realesrgan.py --model_path experiments/pretrained_models/RealESRGAN_x4plus.pth --input infile --output outfile [options]... Usage: python inference_realesrgan.py -n RealESRGAN_x4plus -i infile -o outfile [options]...
A common command: python inference_realesrgan.py --model_path experiments/pretrained_models/RealESRGAN_x4plus.pth --input infile --netscale 4 --outscale 3.5 --half --face_enhance A common command: python inference_realesrgan.py -n RealESRGAN_x4plus -i infile --outscale 3.5 --half --face_enhance
-h show this help -h show this help
--input Input image or folder. Default: inputs -i --input Input image or folder. Default: inputs
--output Output folder. Default: results -o --output Output folder. Default: results
--model_path Path to the pre-trained model. Default: experiments/pretrained_models/RealESRGAN_x4plus.pth -n --model_name Model name. Default: RealESRGAN_x4plus
--netscale Upsample scale factor of the network. Default: 4 -s, --outscale The final upsampling scale of the image. Default: 4
--outscale The final upsampling scale of the image. Default: 4
--suffix Suffix of the restored image. Default: out --suffix Suffix of the restored image. Default: out
--tile Tile size, 0 for no tile during testing. Default: 0 -t, --tile Tile size, 0 for no tile during testing. Default: 0
--face_enhance Whether to use GFPGAN to enhance face. Default: False --face_enhance Whether to use GFPGAN to enhance face. Default: False
--half Whether to use half precision during inference. Default: False --half Whether to use half precision during inference. Default: False
--ext Image extension. Options: auto | jpg | png, auto means using the same extension as inputs. Default: auto --ext Image extension. Options: auto | jpg | png, auto means using the same extension as inputs. Default: auto

View File

@@ -1,12 +1,13 @@
# Anime model # Anime Model
:white_check_mark: We add [*RealESRGAN_x4plus_anime_6B.pth*](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/RealESRGAN_x4plus_anime_6B.pth), which is optimized for **anime** images with much smaller model size. :white_check_mark: We add [*RealESRGAN_x4plus_anime_6B.pth*](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/RealESRGAN_x4plus_anime_6B.pth), which is optimized for **anime** images with much smaller model size.
- [How to Use](#How-to-Use) - [Anime Model](#anime-model)
- [PyTorch Inference](#PyTorch-Inference) - [How to Use](#how-to-use)
- [ncnn Executable File](#ncnn-Executable-File) - [PyTorch Inference](#pytorch-inference)
- [Comparisons with waifu2x](#Comparisons-with-waifu2x) - [ncnn Executable File](#ncnn-executable-file)
- [Comparisons with Sliding Bars](#Comparions-with-Sliding-Bars) - [Comparisons with waifu2x](#comparisons-with-waifu2x)
- [Comparisons with Sliding Bars](#comparisons-with-sliding-bars)
<p align="center"> <p align="center">
<img src="https://raw.githubusercontent.com/xinntao/public-figures/master/Real-ESRGAN/cmp_realesrgan_anime_1.png"> <img src="https://raw.githubusercontent.com/xinntao/public-figures/master/Real-ESRGAN/cmp_realesrgan_anime_1.png">
@@ -26,7 +27,7 @@ Pre-trained models: [RealESRGAN_x4plus_anime_6B](https://github.com/xinntao/Real
# download model # download model
wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/RealESRGAN_x4plus_anime_6B.pth -P experiments/pretrained_models wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/RealESRGAN_x4plus_anime_6B.pth -P experiments/pretrained_models
# inference # inference
python inference_realesrgan.py --model_path experiments/pretrained_models/RealESRGAN_x4plus_anime_6B.pth --input inputs python inference_realesrgan.py -n RealESRGAN_x4plus_anime_6B -i inputs
``` ```
### ncnn Executable File ### ncnn Executable File

View File

@@ -5,28 +5,30 @@ import os
from basicsr.archs.rrdbnet_arch import RRDBNet from basicsr.archs.rrdbnet_arch import RRDBNet
from realesrgan import RealESRGANer from realesrgan import RealESRGANer
from realesrgan.archs.srvgg_arch import SRVGGNetCompact
def main(): def main():
"""Inference demo for Real-ESRGAN. """Inference demo for Real-ESRGAN.
""" """
parser = argparse.ArgumentParser() parser = argparse.ArgumentParser()
parser.add_argument('--input', type=str, default='inputs', help='Input image or folder') parser.add_argument('-i', '--input', type=str, default='inputs', help='Input image or folder')
parser.add_argument( parser.add_argument(
'--model_path', '-n',
'--model_name',
type=str, type=str,
default='experiments/pretrained_models/RealESRGAN_x4plus.pth', default='RealESRGAN_x4plus',
help='Path to the pre-trained model') help=('Model names: RealESRGAN_x4plus | RealESRNet_x4plus | RealESRGAN_x4plus_anime_6B | RealESRGAN_x2plus'
parser.add_argument('--output', type=str, default='results', help='Output folder') 'RealESRGANv2-anime-xsx2 | RealESRGANv2-animevideo-xsx2-nousm | RealESRGANv2-animevideo-xsx2'
parser.add_argument('--netscale', type=int, default=4, help='Upsample scale factor of the network') 'RealESRGANv2-anime-xsx4 | RealESRGANv2-animevideo-xsx4-nousm | RealESRGANv2-animevideo-xsx4'))
parser.add_argument('--outscale', type=float, default=4, help='The final upsampling scale of the image') parser.add_argument('-o', '--output', type=str, default='results', help='Output folder')
parser.add_argument('-s', '--outscale', type=float, default=4, help='The final upsampling scale of the image')
parser.add_argument('--suffix', type=str, default='out', help='Suffix of the restored image') parser.add_argument('--suffix', type=str, default='out', help='Suffix of the restored image')
parser.add_argument('--tile', type=int, default=0, help='Tile size, 0 for no tile during testing') parser.add_argument('-t', '--tile', type=int, default=0, help='Tile size, 0 for no tile during testing')
parser.add_argument('--tile_pad', type=int, default=10, help='Tile padding') parser.add_argument('--tile_pad', type=int, default=10, help='Tile padding')
parser.add_argument('--pre_pad', type=int, default=0, help='Pre padding size at each border') parser.add_argument('--pre_pad', type=int, default=0, help='Pre padding size at each border')
parser.add_argument('--face_enhance', action='store_true', help='Use GFPGAN to enhance face') parser.add_argument('--face_enhance', action='store_true', help='Use GFPGAN to enhance face')
parser.add_argument('--half', action='store_true', help='Use half precision during inference') parser.add_argument('--half', action='store_true', help='Use half precision during inference')
parser.add_argument('--block', type=int, default=23, help='num_block in RRDB')
parser.add_argument( parser.add_argument(
'--alpha_upsampler', '--alpha_upsampler',
type=str, type=str,
@@ -39,16 +41,39 @@ def main():
help='Image extension. Options: auto | jpg | png, auto means using the same extension as inputs') help='Image extension. Options: auto | jpg | png, auto means using the same extension as inputs')
args = parser.parse_args() args = parser.parse_args()
if 'RealESRGAN_x4plus_anime_6B.pth' in args.model_path: # determine models according to model names
args.block = 6 args.model_name = args.model_name.split('.')[0]
elif 'RealESRGAN_x2plus.pth' in args.model_path: if args.model_name in ['RealESRGAN_x4plus', 'RealESRNet_x4plus']: # x4 RRDBNet model
args.netscale = 2 model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32, scale=4)
netscale = 4
elif args.model_name in ['RealESRGAN_x4plus_anime_6B']: # x4 RRDBNet model with 6 blocks
model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=6, num_grow_ch=32, scale=4)
netscale = 4
elif args.model_name in ['RealESRGAN_x2plus']: # x2 RRDBNet model
model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32, scale=2)
netscale = 2
elif args.model_name in [
'RealESRGANv2-anime-xsx2', 'RealESRGANv2-animevideo-xsx2-nousm', 'RealESRGANv2-animevideo-xsx2'
]: # x2 VGG-style model (XS size)
model = SRVGGNetCompact(num_in_ch=3, num_out_ch=3, num_feat=64, num_conv=16, upscale=2, act_type='prelu')
netscale = 2
elif args.model_name in [
'RealESRGANv2-anime-xsx4', 'RealESRGANv2-animevideo-xsx4-nousm', 'RealESRGANv2-animevideo-xsx4'
]: # x4 VGG-style model (XS size)
model = SRVGGNetCompact(num_in_ch=3, num_out_ch=3, num_feat=64, num_conv=16, upscale=4, act_type='prelu')
netscale = 4
model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=args.block, num_grow_ch=32, scale=args.netscale) # determine model paths
model_path = os.path.join('experiments/pretrained_models', args.model_name + '.pth')
if not os.path.isfile(model_path):
model_path = os.path.join('realesrgan/weights', args.model_name + '.pth')
if not os.path.isfile(model_path):
raise ValueError(f'Model {args.model_name} does not exist.')
# restorer
upsampler = RealESRGANer( upsampler = RealESRGANer(
scale=args.netscale, scale=netscale,
model_path=args.model_path, model_path=model_path,
model=model, model=model,
tile=args.tile, tile=args.tile,
tile_pad=args.tile_pad, tile_pad=args.tile_pad,
@@ -80,15 +105,6 @@ def main():
else: else:
img_mode = None img_mode = None
# give warnings for too large/small images
h, w = img.shape[0:2]
if max(h, w) > 1000 and args.netscale == 4:
import warnings
warnings.warn('The input image is large, try X2 model for better performance.')
if max(h, w) < 500 and args.netscale == 2:
import warnings
warnings.warn('The input image is small, try X4 model for better performance.')
try: try:
if args.face_enhance: if args.face_enhance:
_, _, output = face_enhancer.enhance(img, has_aligned=False, only_center_face=False, paste_back=True) _, _, output = face_enhancer.enhance(img, has_aligned=False, only_center_face=False, paste_back=True)

View File

@@ -0,0 +1,69 @@
from basicsr.utils.registry import ARCH_REGISTRY
from torch import nn as nn
from torch.nn import functional as F
@ARCH_REGISTRY.register()
class SRVGGNetCompact(nn.Module):
"""A compact VGG-style network structure for super-resolution.
It is a compact network structure, which performs upsampling in the last layer and no convolution is
conducted on the HR feature space.
Args:
num_in_ch (int): Channel number of inputs. Default: 3.
num_out_ch (int): Channel number of outputs. Default: 3.
num_feat (int): Channel number of intermediate features. Default: 64.
num_conv (int): Number of convolution layers in the body network. Default: 16.
upscale (int): Upsampling factor. Default: 4.
act_type (str): Activation type, options: 'relu', 'prelu', 'leakyrelu'. Default: prelu.
"""
def __init__(self, num_in_ch=3, num_out_ch=3, num_feat=64, num_conv=16, upscale=4, act_type='prelu'):
super(SRVGGNetCompact, self).__init__()
self.num_in_ch = num_in_ch
self.num_out_ch = num_out_ch
self.num_feat = num_feat
self.num_conv = num_conv
self.upscale = upscale
self.act_type = act_type
self.body = nn.ModuleList()
# the first conv
self.body.append(nn.Conv2d(num_in_ch, num_feat, 3, 1, 1))
# the first activation
if act_type == 'relu':
activation = nn.ReLU(inplace=True)
elif act_type == 'prelu':
activation = nn.PReLU(num_parameters=num_feat)
elif act_type == 'leakyrelu':
activation = nn.LeakyReLU(negative_slope=0.1, inplace=True)
self.body.append(activation)
# the body structure
for _ in range(num_conv):
self.body.append(nn.Conv2d(num_feat, num_feat, 3, 1, 1))
# activation
if act_type == 'relu':
activation = nn.ReLU(inplace=True)
elif act_type == 'prelu':
activation = nn.PReLU(num_parameters=num_feat)
elif act_type == 'leakyrelu':
activation = nn.LeakyReLU(negative_slope=0.1, inplace=True)
self.body.append(activation)
# the last conv
self.body.append(nn.Conv2d(num_feat, num_out_ch * upscale * upscale, 3, 1, 1))
# upsample
self.upsampler = nn.PixelShuffle(upscale)
def forward(self, x):
out = x
for i in range(0, len(self.body)):
out = self.body[i](out)
out = self.upsampler(out)
# add the nearest upsampled image, so that the network learns the residual
base = F.interpolate(x, scale_factor=self.upscale, mode='nearest')
out += base
return out

View File

@@ -3,7 +3,6 @@ import math
import numpy as np import numpy as np
import os import os
import torch import torch
from basicsr.archs.rrdbnet_arch import RRDBNet
from basicsr.utils.download_util import load_file_from_url from basicsr.utils.download_util import load_file_from_url
from torch.nn import functional as F from torch.nn import functional as F
@@ -16,7 +15,7 @@ class RealESRGANer():
Args: Args:
scale (int): Upsampling scale factor used in the networks. It is usually 2 or 4. scale (int): Upsampling scale factor used in the networks. It is usually 2 or 4.
model_path (str): The path to the pretrained model. It can be urls (will first download it automatically). model_path (str): The path to the pretrained model. It can be urls (will first download it automatically).
model (nn.Module): The defined network. If None, the model will be constructed here. Default: None. model (nn.Module): The defined network. Default: None.
tile (int): As too large images result in the out of GPU memory issue, so this tile option will first crop tile (int): As too large images result in the out of GPU memory issue, so this tile option will first crop
input images into tiles, and then process each of them. Finally, they will be merged into one image. input images into tiles, and then process each of them. Finally, they will be merged into one image.
0 denotes for do not use tile. Default: 0. 0 denotes for do not use tile. Default: 0.
@@ -35,9 +34,6 @@ class RealESRGANer():
# initialize model # initialize model
self.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu') self.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
if model is None:
model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32, scale=scale)
# if the model_path starts with https, it will first download models to the folder: realesrgan/weights # if the model_path starts with https, it will first download models to the folder: realesrgan/weights
if model_path.startswith('https://'): if model_path.startswith('https://'):
model_path = load_file_from_url( model_path = load_file_from_url(