V0.2.3.0: add anime video models

Update readme for anime video models; add video demo (#181 )
* update readme * update readme * update readme * update readme * update readme * update readme * update readme * update readme * update readme
2021-12-12 20:19:09 +08:00 · 2021-12-12 20:17:30 +08:00 · 2021-12-12 16:49:35 +08:00 · 2021-12-12 13:29:21 +08:00
13 changed files with 586 additions and 104 deletions
--- a/FAQ.md
+++ b/FAQ.md
@@ -1,9 +1,5 @@
 # FAQ

-1. **What is the difference of `--netscale` and `outscale`?**
-
-A: TODO.
-
 1. **How to select models?**

 A: TODO.
--- a/README.md
+++ b/README.md
@@ -10,12 +10,10 @@

 [English](README.md) **|** [简体中文](README_CN.md)

+:fire: :fire: :fire: Add **small video models** for anime videos (**针对动漫视频的小模型**). Please see [anime video models](docs/anime_video_model.md).
+
 1. [Colab Demo](https://colab.research.google.com/drive/1k2Zod6kSHEvraybHl50Lys0LerhyTMCo?usp=sharing) for Real-ESRGAN <a href="https://colab.research.google.com/drive/1k2Zod6kSHEvraybHl50Lys0LerhyTMCo?usp=sharing"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="google colab logo"></a>.
-2. Portable [Windows](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/realesrgan-ncnn-vulkan-20210901-windows.zip) / [Linux](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/realesrgan-ncnn-vulkan-20210901-ubuntu.zip) / [MacOS](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/realesrgan-ncnn-vulkan-20210901-macos.zip) **executable files for Intel/AMD/Nvidia GPU**. You can find more information [here](#Portable-executable-files). The ncnn implementation is in [Real-ESRGAN-ncnn-vulkan](https://github.com/xinntao/Real-ESRGAN-ncnn-vulkan).
-
-Thanks for your interests and use:-) There are still many problems about the anime/illustration model, mainly including: 1. It cannot deal with videos; 2. It cannot be aware of depth/depth-of-field; 3. It is not adjustable; 4. May change the original style. Thanks for your valuable feedbacks/suggestions. All the feedbacks are updated in [feedback.md](feedback.md). Hopefully, a new model will be available soon.
-
-感谢大家的关注和使用:-) 关于动漫插画的模型，目前还有很多问题，主要有: 1. 视频处理不了; 2. 景深虚化有问题; 3. 不可调节, 效果过了; 4. 改变原来的风格。大家提供了很好的反馈。这些反馈会逐步更新在 [这个文档](feedback.md)。希望不久之后，有新模型可以使用.
+2. Portable [Windows](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.3.0/realesrgan-ncnn-vulkan-20211212-windows.zip) / [Linux](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.3.0/realesrgan-ncnn-vulkan-20211212-ubuntu.zip) / [MacOS](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.3.0/realesrgan-ncnn-vulkan-20211212-macos.zip) **executable files for Intel/AMD/Nvidia GPU**. You can find more information [here](#Portable-executable-files). The ncnn implementation is in [Real-ESRGAN-ncnn-vulkan](https://github.com/xinntao/Real-ESRGAN-ncnn-vulkan).

 Real-ESRGAN aims at developing **Practical Algorithms for General Image Restoration**.<br>
 We extend the powerful ESRGAN to a practical restoration application (namely, Real-ESRGAN), which is trained with pure synthetic data.
@@ -24,7 +22,10 @@ We extend the powerful ESRGAN to a practical restoration application (namely, Re

 :question: Frequently Asked Questions can be found in [FAQ.md](FAQ.md) (Well, it is still empty there =-=||).

+:milky_way: Thanks for your valuable feedbacks/suggestions. All the feedbacks are updated in [feedback.md](feedback.md).
+
 :triangular_flag_on_post: **Updates**
+- :white_check_mark: Add small models for anime videos. More details are in [anime video models](docs/anime_video_model.md).
 - :white_check_mark: Add the ncnn implementation [Real-ESRGAN-ncnn-vulkan](https://github.com/xinntao/Real-ESRGAN-ncnn-vulkan).
 - :white_check_mark: Add [*RealESRGAN_x4plus_anime_6B.pth*](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/RealESRGAN_x4plus_anime_6B.pth), which is optimized for **anime** images with much smaller model size. More details and comparisons with [waifu2x](https://github.com/nihui/waifu2x-ncnn-vulkan) are in [**anime_model.md**](docs/anime_model.md)
 - :white_check_mark: Support finetuning on your own data or paired data (*i.e.*, finetuning ESRGAN). See [here](Training.md#Finetune-Real-ESRGAN-on-your-own-dataset)
@@ -80,21 +81,23 @@ If you have some images that Real-ESRGAN could not well restored, please also op

 ### Portable executable files

-You can download [Windows](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.2/realesrgan-ncnn-vulkan-20210801-windows.zip) / [Linux](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.2/realesrgan-ncnn-vulkan-20210801-ubuntu.zip) / [MacOS](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.2/realesrgan-ncnn-vulkan-20210801-macos.zip) **executable files for Intel/AMD/Nvidia GPU**.
+You can download [Windows](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.3.0/realesrgan-ncnn-vulkan-20211212-windows.zip) / [Linux](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.3.0/realesrgan-ncnn-vulkan-20211212-ubuntu.zip) / [MacOS](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.3.0/realesrgan-ncnn-vulkan-20211212-macos.zip) **executable files for Intel/AMD/Nvidia GPU**.

 This executable file is **portable** and includes all the binaries and models required. No CUDA or PyTorch environment is needed.<br>

 You can simply run the following command (the Windows example, more information is in the README.md of each executable files):

 ```bash
-./realesrgan-ncnn-vulkan.exe -i input.jpg -o output.png
+./realesrgan-ncnn-vulkan.exe -i input.jpg -o output.png -n model_name
 ```

-We have provided three models:
+We have provided five models:

 1. realesrgan-x4plus  (default)
 2. realesrnet-x4plus
 3. realesrgan-x4plus-anime (optimized for anime images, small model size)
+4. RealESRGANv2-animevideo-xsx2 (anime video, X2)
+5. RealESRGANv2-animevideo-xsx4 (anime video, X4)

 You can use the `-n` argument for other models, for example, `./realesrgan-ncnn-vulkan.exe -i input.jpg -o output.png -n realesrnet-x4plus`

@@ -166,7 +169,7 @@ wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.0/RealESRGAN_
 Inference!

 ```bash
-python inference_realesrgan.py --model_path experiments/pretrained_models/RealESRGAN_x4plus.pth --input inputs --face_enhance
+python inference_realesrgan.py -n RealESRGAN_x4plus -i inputs --face_enhance
 ```

 Results are in the `results` folder
@@ -184,7 +187,7 @@ Pre-trained models: [RealESRGAN_x4plus_anime_6B](https://github.com/xinntao/Real
 # download model
 wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/RealESRGAN_x4plus_anime_6B.pth -P experiments/pretrained_models
 # inference
-python inference_realesrgan.py --model_path experiments/pretrained_models/RealESRGAN_x4plus_anime_6B.pth --input inputs
+python inference_realesrgan.py -n RealESRGAN_x4plus_anime_6B -i inputs
 ```

 Results are in the `results` folder
@@ -194,37 +197,26 @@ Results are in the `results` folder
 1. You can use X4 model for **arbitrary output size** with the argument `outscale`. The program will further perform cheap resize operation after the Real-ESRGAN output.

 ```console
-Usage: python inference_realesrgan.py --model_path experiments/pretrained_models/RealESRGAN_x4plus.pth --input infile --output outfile [options]...
+Usage: python inference_realesrgan.py -n RealESRGAN_x4plus -i infile -o outfile [options]...

-A common command: python inference_realesrgan.py --model_path experiments/pretrained_models/RealESRGAN_x4plus.pth --input infile --netscale 4 --outscale 3.5 --half --face_enhance
+A common command: python inference_realesrgan.py -n RealESRGAN_x4plus -i infile --outscale 3.5 --half --face_enhance

  -h                   show this help
-  --input              Input image or folder. Default: inputs
-  --output             Output folder. Default: results
-  --model_path         Path to the pre-trained model. Default: experiments/pretrained_models/RealESRGAN_x4plus.pth
-  --netscale           Upsample scale factor of the network. Default: 4
-  --outscale           The final upsampling scale of the image. Default: 4
+  -i --input           Input image or folder. Default: inputs
+  -o --output          Output folder. Default: results
+  -n --model_name      Model name. Default: RealESRGAN_x4plus
+  -s, --outscale       The final upsampling scale of the image. Default: 4
  --suffix             Suffix of the restored image. Default: out
-  --tile               Tile size, 0 for no tile during testing. Default: 0
+  -t, --tile           Tile size, 0 for no tile during testing. Default: 0
  --face_enhance       Whether to use GFPGAN to enhance face. Default: False
  --half               Whether to use half precision during inference. Default: False
  --ext                Image extension. Options: auto | jpg | png, auto means using the same extension as inputs. Default: auto
 ```

+
 ## :european_castle: Model Zoo

- [RealESRGAN_x4plus](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.0/RealESRGAN_x4plus.pth): X4 model for general images
- [RealESRGAN_x4plus_anime_6B](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/RealESRGAN_x4plus_anime_6B.pth): Optimized for anime images; 6 RRDB blocks (slightly smaller network)
- [RealESRGAN_x2plus](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.1/RealESRGAN_x2plus.pth): X2 model for general images
- [RealESRNet_x4plus](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.1/RealESRNet_x4plus.pth): X4 model with MSE loss (over-smooth effects)
-
- [official ESRGAN_x4](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.1/ESRGAN_SRx4_DF2KOST_official-ff704c30.pth): official ESRGAN model (X4)
-
-The following models are **discriminators**, which are usually used for fine-tuning.
-
- [RealESRGAN_x4plus_netD](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.3/RealESRGAN_x4plus_netD.pth)
- [RealESRGAN_x2plus_netD](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.3/RealESRGAN_x2plus_netD.pth)
- [RealESRGAN_x4plus_anime_6B_netD](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/RealESRGAN_x4plus_anime_6B_netD.pth)
+Please see [docs/model_zoo.md](docs/model_zoo.md)

 ## :computer: Training and Finetuning on your own dataset

--- a/README_CN.md
+++ b/README_CN.md
@@ -10,19 +10,22 @@

 [English](README.md) **|** [简体中文](README_CN.md)

-1. Real-ESRGAN的[Colab Demo](https://colab.research.google.com/drive/1k2Zod6kSHEvraybHl50Lys0LerhyTMCo?usp=sharing) <a href="https://colab.research.google.com/drive/1k2Zod6kSHEvraybHl50Lys0LerhyTMCo?usp=sharing"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="google colab logo"></a>.
-2. **支持Intel/AMD/Nvidia显卡**的绿色版exe文件： [Windows版](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/realesrgan-ncnn-vulkan-20210901-windows.zip) / [Linux版](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/realesrgan-ncnn-vulkan-20210901-ubuntu.zip) / [macOS版](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/realesrgan-ncnn-vulkan-20210901-macos.zip)，详情请移步[这里](#便携版（绿色版）可执行文件)。NCNN的实现在 [Real-ESRGAN-ncnn-vulkan](https://github.com/xinntao/Real-ESRGAN-ncnn-vulkan)。
+:fire: :fire: :fire: 添加了**针对动漫视频的小模型**, 更多信息在 [anime video models](docs/anime_video_model.md) 中.

-感谢大家的关注和使用:-) 关于动漫插画的模型，目前还有很多问题，主要有: 1. 视频处理不了; 2. 景深虚化有问题; 3. 不可调节, 效果过了; 4. 改变原来的风格。大家提供了很好的反馈。这些反馈会逐步更新在 [这个文档](feedback.md)。希望不久之后，有新模型可以使用.
+1. Real-ESRGAN的[Colab Demo](https://colab.research.google.com/drive/1k2Zod6kSHEvraybHl50Lys0LerhyTMCo?usp=sharing) <a href="https://colab.research.google.com/drive/1k2Zod6kSHEvraybHl50Lys0LerhyTMCo?usp=sharing"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="google colab logo"></a>.
+2. **支持Intel/AMD/Nvidia显卡**的绿色版exe文件： [Windows版](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.3.0/realesrgan-ncnn-vulkan-20211212-windows.zip) / [Linux版](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.3.0/realesrgan-ncnn-vulkan-20211212-ubuntu.zip) / [macOS版](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.3.0/realesrgan-ncnn-vulkan-20211212-macos.zip)，详情请移步[这里](#便携版（绿色版）可执行文件)。NCNN的实现在 [Real-ESRGAN-ncnn-vulkan](https://github.com/xinntao/Real-ESRGAN-ncnn-vulkan)。

 Real-ESRGAN 的目标是开发出**实用的图像修复算法**。<br>
 我们在 ESRGAN 的基础上使用纯合成的数据来进行训练，以使其能被应用于实际的图片修复的场景（顾名思义：Real-ESRGAN）。

 :art: Real-ESRGAN 需要，也很欢迎你的贡献，如新功能、模型、bug修复、建议、维护等等。详情可以查看[CONTRIBUTING.md](CONTRIBUTING.md)，所有的贡献者都会被列在[此处](README_CN.md#hugs-感谢)。

+:milky_way: 感谢大家提供了很好的反馈。这些反馈会逐步更新在 [这个文档](feedback.md)。
+
 :question: 常见的问题可以在[FAQ.md](FAQ.md)中找到答案。（好吧，现在还是空白的=-=||）

 :triangular_flag_on_post: **更新**
+- :white_check_mark: 添加了针对动漫视频的小模型, 更多信息在 [anime video models](docs/anime_video_model.md) 中.
 - :white_check_mark: 添加了ncnn 实现：[Real-ESRGAN-ncnn-vulkan](https://github.com/xinntao/Real-ESRGAN-ncnn-vulkan).
 - :white_check_mark: 添加了 [*RealESRGAN_x4plus_anime_6B.pth*](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/RealESRGAN_x4plus_anime_6B.pth)，对二次元图片进行了优化，并减少了model的大小。详情 以及 与[waifu2x](https://github.com/nihui/waifu2x-ncnn-vulkan)的对比请查看[**anime_model.md**](docs/anime_model.md)
 - :white_check_mark: 支持用户在自己的数据上进行微调 (finetune)：[详情](Training.md#Finetune-Real-ESRGAN-on-your-own-dataset)
@@ -76,21 +79,23 @@ Real-ESRGAN 将会被长期支持，我会在空闲的时间中持续维护更

 ### 便携版（绿色版）可执行文件

-你可以下载**支持Intel/AMD/Nvidia显卡**的绿色版exe文件： [Windows版](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/realesrgan-ncnn-vulkan-20210901-windows.zip) / [Linux版](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/realesrgan-ncnn-vulkan-20210901-ubuntu.zip) / [macOS版](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/realesrgan-ncnn-vulkan-20210901-macos.zip)。
+你可以下载**支持Intel/AMD/Nvidia显卡**的绿色版exe文件： [Windows版](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.3.0/realesrgan-ncnn-vulkan-20211212-windows.zip) / [Linux版](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.3.0/realesrgan-ncnn-vulkan-20211212-ubuntu.zip) / [macOS版](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.3.0/realesrgan-ncnn-vulkan-20211212-macos.zip)。

 绿色版指的是这些exe你可以直接运行（放U盘里拷走都没问题），因为里面已经有所需的文件和模型了。它不需要 CUDA 或者 PyTorch运行环境。<br>

 你可以通过下面这个命令来运行（Windows版本的例子，更多信息请查看对应版本的README.md）：

 ```bash
-./realesrgan-ncnn-vulkan.exe -i 输入图像.jpg -o 输出图像.png
+./realesrgan-ncnn-vulkan.exe -i 输入图像.jpg -o 输出图像.png -n 模型名字
 ```

-我们提供了三种模型：
+我们提供了五种模型：

 1. realesrgan-x4plus（默认）
 2. reaesrnet-x4plus
 3. realesrgan-x4plus-anime（针对动漫插画图像优化，有更小的体积）
+4. RealESRGANv2-animevideo-xsx2 (针对动漫视频, X2)
+5. RealESRGANv2-animevideo-xsx4 (针对动漫视频, X4)

 你可以通过`-n`参数来使用其他模型，例如`./realesrgan-ncnn-vulkan.exe -i 二次元图片.jpg -o 二刺螈图片.png -n realesrgan-x4plus-anime`

@@ -162,7 +167,7 @@ wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.0/RealESRGAN_
 推断!

 ```bash
-python inference_realesrgan.py --model_path experiments/pretrained_models/RealESRGAN_x4plus.pth --input inputs --face_enhance
+python inference_realesrgan.py -n RealESRGAN_x4plus -i inputs --face_enhance
 ```

 结果在`results`文件夹
@@ -180,28 +185,27 @@ python inference_realesrgan.py --model_path experiments/pretrained_models/RealES
 # 下载模型
 wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/RealESRGAN_x4plus_anime_6B.pth -P experiments/pretrained_models
 # 推断
-python inference_realesrgan.py --model_path experiments/pretrained_models/RealESRGAN_x4plus_anime_6B.pth --input inputs
+python inference_realesrgan.py -n RealESRGAN_x4plus_anime_6B -i inputs
 ```

 结果在`results`文件夹

 ### Python 脚本的用法

-1. 虽然你实用了 X4 模型，但是你可以 **输出任意尺寸比例的图片**，只要实用了 `outscale` 参数. 程序会进一步对模型的输出图像进行缩放。
+1. 虽然你使用了 X4 模型，但是你可以 **输出任意尺寸比例的图片**，只要实用了 `outscale` 参数. 程序会进一步对模型的输出图像进行缩放。

 ```console
-Usage: python inference_realesrgan.py --model_path experiments/pretrained_models/RealESRGAN_x4plus.pth --input infile --output outfile [options]...
+Usage: python inference_realesrgan.py -n RealESRGAN_x4plus -i infile -o outfile [options]...

-A common command: python inference_realesrgan.py --model_path experiments/pretrained_models/RealESRGAN_x4plus.pth --input infile --netscale 4 --outscale 3.5 --half --face_enhance
+A common command: python inference_realesrgan.py -n RealESRGAN_x4plus -i infile --outscale 3.5 --half --face_enhance

  -h                   show this help
-  --input              Input image or folder. Default: inputs
-  --output             Output folder. Default: results
-  --model_path         Path to the pre-trained model. Default: experiments/pretrained_models/RealESRGAN_x4plus.pth
-  --netscale           Upsample scale factor of the network. Default: 4
-  --outscale           The final upsampling scale of the image. Default: 4
+  -i --input           Input image or folder. Default: inputs
+  -o --output          Output folder. Default: results
+  -n --model_name      Model name. Default: RealESRGAN_x4plus
+  -s, --outscale       The final upsampling scale of the image. Default: 4
  --suffix             Suffix of the restored image. Default: out
-  --tile               Tile size, 0 for no tile during testing. Default: 0
+  -t, --tile           Tile size, 0 for no tile during testing. Default: 0
  --face_enhance       Whether to use GFPGAN to enhance face. Default: False
  --half               Whether to use half precision during inference. Default: False
  --ext                Image extension. Options: auto | jpg | png, auto means using the same extension as inputs. Default: auto
@@ -209,18 +213,7 @@ A common command: python inference_realesrgan.py --model_path experiments/pretra

 ## :european_castle: 模型库

- [RealESRGAN_x4plus](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.0/RealESRGAN_x4plus.pth): X4 model for general images
- [RealESRGAN_x4plus_anime_6B](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/RealESRGAN_x4plus_anime_6B.pth): Optimized for anime images; 6 RRDB blocks (slightly smaller network)
- [RealESRGAN_x2plus](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.1/RealESRGAN_x2plus.pth): X2 model for general images
- [RealESRNet_x4plus](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.1/RealESRNet_x4plus.pth): X4 model with MSE loss (over-smooth effects)
-
- [official ESRGAN_x4](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.1/ESRGAN_SRx4_DF2KOST_official-ff704c30.pth): official ESRGAN model (X4)
-
-下面是 **判别器** 模型, 他们经常被用来微调（fine-tune）模型.
-
- [RealESRGAN_x4plus_netD](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.3/RealESRGAN_x4plus_netD.pth)
- [RealESRGAN_x2plus_netD](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.3/RealESRGAN_x2plus_netD.pth)
- [RealESRGAN_x4plus_anime_6B_netD](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/RealESRGAN_x4plus_anime_6B_netD.pth)
+请参见 [docs/model_zoo.md](docs/model_zoo.md)

 ## :computer: 训练，在你的数据上微调（Fine-tune）

--- a/2
+++ b/2
@@ -1 +1 @@
-0.2.2.6
+0.2.3.0
--- a/docs/anime_model.md
+++ b/docs/anime_model.md
@@ -1,12 +1,13 @@
-# Anime model
+# Anime Model

 :white_check_mark: We add [*RealESRGAN_x4plus_anime_6B.pth*](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/RealESRGAN_x4plus_anime_6B.pth), which is optimized for **anime** images with much smaller model size.

- [How to Use](#How-to-Use)
-  - [PyTorch Inference](#PyTorch-Inference)
-  - [ncnn Executable File](#ncnn-Executable-File)
- [Comparisons with waifu2x](#Comparisons-with-waifu2x)
- [Comparisons with Sliding Bars](#Comparions-with-Sliding-Bars)
+- [Anime Model](#anime-model)
+  - [How to Use](#how-to-use)
+    - [PyTorch Inference](#pytorch-inference)
+    - [ncnn Executable File](#ncnn-executable-file)
+  - [Comparisons with waifu2x](#comparisons-with-waifu2x)
+  - [Comparisons with Sliding Bars](#comparisons-with-sliding-bars)

 <p align="center">
  <img src="https://raw.githubusercontent.com/xinntao/public-figures/master/Real-ESRGAN/cmp_realesrgan_anime_1.png">
@@ -26,12 +27,12 @@ Pre-trained models: [RealESRGAN_x4plus_anime_6B](https://github.com/xinntao/Real
 # download model
 wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/RealESRGAN_x4plus_anime_6B.pth -P experiments/pretrained_models
 # inference
-python inference_realesrgan.py --model_path experiments/pretrained_models/RealESRGAN_x4plus_anime_6B.pth --input inputs
+python inference_realesrgan.py -n RealESRGAN_x4plus_anime_6B -i inputs
 ```

 ### ncnn Executable File

-Download the latest portable [Windows](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/realesrgan-ncnn-vulkan-20210901-windows.zip) / [Linux](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/realesrgan-ncnn-vulkan-20210901-ubuntu.zip) / [MacOS](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/realesrgan-ncnn-vulkan-20210901-macos.zip) **executable files for Intel/AMD/Nvidia GPU**.
+Download the latest portable [Windows](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.3.0/realesrgan-ncnn-vulkan-20211212-windows.zip) / [Linux](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.3.0/realesrgan-ncnn-vulkan-20211212-ubuntu.zip) / [MacOS](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.3.0/realesrgan-ncnn-vulkan-20211212-macos.zip) **executable files for Intel/AMD/Nvidia GPU**.

 Taking the Windows as example, run:

--- a/docs/anime_video_model.md
+++ b/docs/anime_video_model.md
@@ -0,0 +1,121 @@
+# Anime Video Models
+
+:white_check_mark: We add small models that are optimized for anime videos :-)
+
+| Models                                                                                                                             | Scale | Description                    |
+| ---------------------------------------------------------------------------------------------------------------------------------- | :---- | :----------------------------- |
+| [RealESRGANv2-animevideo-xsx2](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.3.0/RealESRGANv2-animevideo-xsx2.pth) | X2    | Anime video model with XS size |
+| [RealESRGANv2-animevideo-xsx4](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.3.0/RealESRGANv2-animevideo-xsx4.pth) | X4    | Anime video model with XS size |
+
+- [Anime Video Models](#anime-video-models)
+  - [How to Use](#how-to-use)
+    - [PyTorch Inference](#pytorch-inference)
+    - [ncnn Executable File](#ncnn-executable-file)
+      - [Step 1: Use ffmpeg to extract frames from video](#step-1-use-ffmpeg-to-extract-frames-from-video)
+      - [Step 2: Inference with Real-ESRGAN executable file](#step-2-inference-with-real-esrgan-executable-file)
+      - [Step 3: Merge the enhanced frames back into a video](#step-3-merge-the-enhanced-frames-back-into-a-video)
+  - [More Demos](#more-demos)
+
+---
+
+The following are some demos (best view in the full screen mode).
+
+https://user-images.githubusercontent.com/17445847/145706977-98bc64a4-af27-481c-8abe-c475e15db7ff.MP4
+
+https://user-images.githubusercontent.com/17445847/145707055-6a4b79cb-3d9d-477f-8610-c6be43797133.MP4
+
+https://user-images.githubusercontent.com/17445847/145707046-8702a17c-a194-4471-8a53-a4cc44c9594c.MP4
+
+## How to Use
+
+### PyTorch Inference
+
+```bash
+# download model
+wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.3.0/RealESRGANv2-animevideo-xsx2.pth -P experiments/pretrained_models
+# inference
+python inference_realesrgan_video.py -i inputs/video/onepiece_demo.mp4 -n RealESRGANv2-animevideo-xsx2 -s 2 -v -a --half --suffix outx2
+```
+
+### ncnn Executable File
+
+#### Step 1: Use ffmpeg to extract frames from video
+
+```bash
+ffmpeg -i onepiece_demo.mp4 -qscale:v 1 -qmin 1 -qmax 1 -vsync 0 tmp_frames/frame%08d.png
+```
+
+- Remember to create the folder `tmp_frames` ahead
+
+#### Step 2: Inference with Real-ESRGAN executable file
+
+1. Download the latest portable [Windows](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.3.0/realesrgan-ncnn-vulkan-20211212-windows.zip) / [Linux](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.3.0/realesrgan-ncnn-vulkan-20211212-ubuntu.zip) / [MacOS](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.3.0/realesrgan-ncnn-vulkan-20211212-macos.zip) **executable files for Intel/AMD/Nvidia GPU**
+
+1. Taking the Windows as example, run:
+
+    ```bash
+    ./realesrgan-ncnn-vulkan.exe -i tmp_frames -o out_frames -n RealESRGANv2-animevideo-xsx2 -s 2 -f jpg
+    ```
+
+    - Remember to create the folder `out_frames` ahead
+
+#### Step 3: Merge the enhanced frames back into a video
+
+1. First obtain fps from input videos by
+
+    ```bash
+    ffmpeg -i onepiece_demo.mp4
+    ```
+
+    ```console
+    Usage:
+    -i                   input video path
+    ```
+
+    You will get the output similar to the following screenshot.
+
+    <p align="center">
+        <img src="https://user-images.githubusercontent.com/17445847/145710145-c4f3accf-b82f-4307-9f20-3803a2c73f57.png">
+    </p>
+
+2. Merge frames
+
+    ```bash
+    ffmpeg -i out_frames/frame%08d.jpg -c:v libx264 -r 23.98 -pix_fmt yuv420p output.mp4
+    ```
+
+    ```console
+    Usage:
+    -i                   input video path
+    -c:v                 video encoder (usually we use libx264)
+    -r                   fps, remember to modify it to meet your needs
+    -pix_fmt             pixel format in video
+    ```
+
+    If you also want to copy audio from the input videos, run:
+
+     ```bash
+    ffmpeg -i out_frames/frame%08d.jpg -i onepiece_demo.mp4 -map 0:v:0 -map 1:a:0 -c:a copy -c:v libx264 -r 23.98 -pix_fmt yuv420p output_w_audio.mp4
+    ```
+
+    ```console
+    Usage:
+    -i                   input video path, here we use two input streams
+    -c:v                 video encoder (usually we use libx264)
+    -r                   fps, remember to modify it to meet your needs
+    -pix_fmt             pixel format in video
+    ```
+
+## More Demos
+
+- Input video for One Piece:
+
+    https://user-images.githubusercontent.com/17445847/145706822-0e83d9c4-78ef-40ee-b2a4-d8b8c3692d17.mp4
+
+- Out video for One Piece
+
+    https://user-images.githubusercontent.com/17445847/145706827-384108c0-78f6-4aa7-9621-99d1aaf65682.mp4
+
+**More comparisons**
+
+https://user-images.githubusercontent.com/17445847/145707458-04a5e9b9-2edd-4d1f-b400-380a72e5f5e6.MP4
--- a/docs/model_zoo.md
+++ b/docs/model_zoo.md
@@ -0,0 +1,47 @@
+# :european_castle: Model Zoo
+
+- [:european_castle: Model Zoo](#european_castle-model-zoo)
+  - [For General Images](#for-general-images)
+  - [For Anime Images](#for-anime-images)
+  - [For Anime Videos](#for-anime-videos)
+
+---
+
+## For General Images
+
+| Models                                                                                                                          | Scale | Description                                  |
+| ------------------------------------------------------------------------------------------------------------------------------- | :---- | :------------------------------------------- |
+| [RealESRGAN_x4plus](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.0/RealESRGAN_x4plus.pth)                      | X4    | X4 model for general images                  |
+| [RealESRGAN_x2plus](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.1/RealESRGAN_x2plus.pth)                      | X2    | X2 model for general images                  |
+| [RealESRNet_x4plus](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.1/RealESRNet_x4plus.pth)                      | X4    | X4 model with MSE loss (over-smooth effects) |
+| [official ESRGAN_x4](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.1/ESRGAN_SRx4_DF2KOST_official-ff704c30.pth) | X4    | official ESRGAN model                        |
+
+The following models are **discriminators**, which are usually used for fine-tuning.
+
+| Models                                                                                                                 | Corresponding model |
+| ---------------------------------------------------------------------------------------------------------------------- | :------------------ |
+| [RealESRGAN_x4plus_netD](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.3/RealESRGAN_x4plus_netD.pth) | RealESRGAN_x4plus   |
+| [RealESRGAN_x2plus_netD](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.3/RealESRGAN_x2plus_netD.pth) | RealESRGAN_x2plus   |
+
+## For Anime Images
+
+| Models                                                                                                                         | Scale | Description                                                 |
+| ------------------------------------------------------------------------------------------------------------------------------ | :---- | :---------------------------------------------------------- |
+| [RealESRGAN_x4plus_anime_6B](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/RealESRGAN_x4plus_anime_6B.pth) | X4    | Optimized for anime images; 6 RRDB blocks (smaller network) |
+
+The following models are **discriminators**, which are usually used for fine-tuning.
+
+| Models                                                                                                                                   | Corresponding model        |
+| ---------------------------------------------------------------------------------------------------------------------------------------- | :------------------------- |
+| [RealESRGAN_x4plus_anime_6B_netD](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/RealESRGAN_x4plus_anime_6B_netD.pth) | RealESRGAN_x4plus_anime_6B |
+
+## For Anime Videos
+
+| Models                                                                                                                             | Scale | Description                    |
+| ---------------------------------------------------------------------------------------------------------------------------------- | :---- | :----------------------------- |
+| [RealESRGANv2-animevideo-xsx2](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.3.0/RealESRGANv2-animevideo-xsx2.pth) | X2    | Anime video model with XS size |
+| [RealESRGANv2-animevideo-xsx4](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.3.0/RealESRGANv2-animevideo-xsx4.pth) | X4    | Anime video model with XS size |
+
+The following models are **discriminators**, which are usually used for fine-tuning.
+
+TODO
--- a/inference_realesrgan.py
+++ b/inference_realesrgan.py
@@ -5,28 +5,30 @@ import os
 from basicsr.archs.rrdbnet_arch import RRDBNet

 from realesrgan import RealESRGANer
+from realesrgan.archs.srvgg_arch import SRVGGNetCompact


 def main():
    """Inference demo for Real-ESRGAN.
    """
    parser = argparse.ArgumentParser()
-    parser.add_argument('--input', type=str, default='inputs', help='Input image or folder')
+    parser.add_argument('-i', '--input', type=str, default='inputs', help='Input image or folder')
    parser.add_argument(
-        '--model_path',
+        '-n',
+        '--model_name',
        type=str,
-        default='experiments/pretrained_models/RealESRGAN_x4plus.pth',
-        help='Path to the pre-trained model')
-    parser.add_argument('--output', type=str, default='results', help='Output folder')
-    parser.add_argument('--netscale', type=int, default=4, help='Upsample scale factor of the network')
-    parser.add_argument('--outscale', type=float, default=4, help='The final upsampling scale of the image')
+        default='RealESRGAN_x4plus',
+        help=('Model names: RealESRGAN_x4plus | RealESRNet_x4plus | RealESRGAN_x4plus_anime_6B | RealESRGAN_x2plus'
+              'RealESRGANv2-anime-xsx2 | RealESRGANv2-animevideo-xsx2-nousm | RealESRGANv2-animevideo-xsx2'
+              'RealESRGANv2-anime-xsx4 | RealESRGANv2-animevideo-xsx4-nousm | RealESRGANv2-animevideo-xsx4'))
+    parser.add_argument('-o', '--output', type=str, default='results', help='Output folder')
+    parser.add_argument('-s', '--outscale', type=float, default=4, help='The final upsampling scale of the image')
    parser.add_argument('--suffix', type=str, default='out', help='Suffix of the restored image')
-    parser.add_argument('--tile', type=int, default=0, help='Tile size, 0 for no tile during testing')
+    parser.add_argument('-t', '--tile', type=int, default=0, help='Tile size, 0 for no tile during testing')
    parser.add_argument('--tile_pad', type=int, default=10, help='Tile padding')
    parser.add_argument('--pre_pad', type=int, default=0, help='Pre padding size at each border')
    parser.add_argument('--face_enhance', action='store_true', help='Use GFPGAN to enhance face')
    parser.add_argument('--half', action='store_true', help='Use half precision during inference')
-    parser.add_argument('--block', type=int, default=23, help='num_block in RRDB')
    parser.add_argument(
        '--alpha_upsampler',
        type=str,
@@ -39,16 +41,39 @@ def main():
        help='Image extension. Options: auto | jpg | png, auto means using the same extension as inputs')
    args = parser.parse_args()

-    if 'RealESRGAN_x4plus_anime_6B.pth' in args.model_path:
-        args.block = 6
-    elif 'RealESRGAN_x2plus.pth' in args.model_path:
-        args.netscale = 2
+    # determine models according to model names
+    args.model_name = args.model_name.split('.')[0]
+    if args.model_name in ['RealESRGAN_x4plus', 'RealESRNet_x4plus']:  # x4 RRDBNet model
+        model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32, scale=4)
+        netscale = 4
+    elif args.model_name in ['RealESRGAN_x4plus_anime_6B']:  # x4 RRDBNet model with 6 blocks
+        model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=6, num_grow_ch=32, scale=4)
+        netscale = 4
+    elif args.model_name in ['RealESRGAN_x2plus']:  # x2 RRDBNet model
+        model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32, scale=2)
+        netscale = 2
+    elif args.model_name in [
+            'RealESRGANv2-anime-xsx2', 'RealESRGANv2-animevideo-xsx2-nousm', 'RealESRGANv2-animevideo-xsx2'
+    ]:  # x2 VGG-style model (XS size)
+        model = SRVGGNetCompact(num_in_ch=3, num_out_ch=3, num_feat=64, num_conv=16, upscale=2, act_type='prelu')
+        netscale = 2
+    elif args.model_name in [
+            'RealESRGANv2-anime-xsx4', 'RealESRGANv2-animevideo-xsx4-nousm', 'RealESRGANv2-animevideo-xsx4'
+    ]:  # x4 VGG-style model (XS size)
+        model = SRVGGNetCompact(num_in_ch=3, num_out_ch=3, num_feat=64, num_conv=16, upscale=4, act_type='prelu')
+        netscale = 4

-    model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=args.block, num_grow_ch=32, scale=args.netscale)
+    # determine model paths
+    model_path = os.path.join('experiments/pretrained_models', args.model_name + '.pth')
+    if not os.path.isfile(model_path):
+        model_path = os.path.join('realesrgan/weights', args.model_name + '.pth')
+    if not os.path.isfile(model_path):
+        raise ValueError(f'Model {args.model_name} does not exist.')

+    # restorer
    upsampler = RealESRGANer(
-        scale=args.netscale,
-        model_path=args.model_path,
+        scale=netscale,
+        model_path=model_path,
        model=model,
        tile=args.tile,
        tile_pad=args.tile_pad,
@@ -80,15 +105,6 @@ def main():
        else:
            img_mode = None

-        # give warnings for too large/small images
-        h, w = img.shape[0:2]
-        if max(h, w) > 1000 and args.netscale == 4:
-            import warnings
-            warnings.warn('The input image is large, try X2 model for better performance.')
-        if max(h, w) < 500 and args.netscale == 2:
-            import warnings
-            warnings.warn('The input image is small, try X4 model for better performance.')
-
        try:
            if args.face_enhance:
                _, _, output = face_enhancer.enhance(img, has_aligned=False, only_center_face=False, paste_back=True)
--- a/inference_realesrgan_video.py
+++ b/inference_realesrgan_video.py
@@ -0,0 +1,199 @@
+import argparse
+import glob
+import mimetypes
+import os
+import queue
+import shutil
+import torch
+from basicsr.archs.rrdbnet_arch import RRDBNet
+from basicsr.utils.logger import AvgTimer
+from tqdm import tqdm
+
+from realesrgan import IOConsumer, PrefetchReader, RealESRGANer
+from realesrgan.archs.srvgg_arch import SRVGGNetCompact
+
+
+def main():
+    """Inference demo for Real-ESRGAN.
+    It mainly for restoring anime videos.
+
+    """
+    parser = argparse.ArgumentParser()
+    parser.add_argument('-i', '--input', type=str, default='inputs', help='Input image or folder')
+    parser.add_argument(
+        '-n',
+        '--model_name',
+        type=str,
+        default='RealESRGAN_x4plus',
+        help=('Model names: RealESRGAN_x4plus | RealESRNet_x4plus | RealESRGAN_x4plus_anime_6B | RealESRGAN_x2plus'
+              'RealESRGANv2-anime-xsx2 | RealESRGANv2-animevideo-xsx2-nousm | RealESRGANv2-animevideo-xsx2'
+              'RealESRGANv2-anime-xsx4 | RealESRGANv2-animevideo-xsx4-nousm | RealESRGANv2-animevideo-xsx4'))
+    parser.add_argument('-o', '--output', type=str, default='results', help='Output folder')
+    parser.add_argument('-s', '--outscale', type=float, default=4, help='The final upsampling scale of the image')
+    parser.add_argument('--suffix', type=str, default='out', help='Suffix of the restored video')
+    parser.add_argument('-t', '--tile', type=int, default=0, help='Tile size, 0 for no tile during testing')
+    parser.add_argument('--tile_pad', type=int, default=10, help='Tile padding')
+    parser.add_argument('--pre_pad', type=int, default=0, help='Pre padding size at each border')
+    parser.add_argument('--face_enhance', action='store_true', help='Use GFPGAN to enhance face')
+    parser.add_argument('--half', action='store_true', help='Use half precision during inference')
+    parser.add_argument('-v', '--video', action='store_true', help='Output a video using ffmpeg')
+    parser.add_argument('-a', '--audio', action='store_true', help='Keep audio')
+    parser.add_argument('--fps', type=float, default=None, help='FPS of the output video')
+    parser.add_argument('--consumer', type=int, default=4, help='Number of IO consumers')
+
+    parser.add_argument(
+        '--alpha_upsampler',
+        type=str,
+        default='realesrgan',
+        help='The upsampler for the alpha channels. Options: realesrgan | bicubic')
+    parser.add_argument(
+        '--ext',
+        type=str,
+        default='auto',
+        help='Image extension. Options: auto | jpg | png, auto means using the same extension as inputs')
+    args = parser.parse_args()
+
+    # ---------------------- determine models according to model names ---------------------- #
+    args.model_name = args.model_name.split('.')[0]
+    if args.model_name in ['RealESRGAN_x4plus', 'RealESRNet_x4plus']:  # x4 RRDBNet model
+        model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32, scale=4)
+        netscale = 4
+    elif args.model_name in ['RealESRGAN_x4plus_anime_6B']:  # x4 RRDBNet model with 6 blocks
+        model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=6, num_grow_ch=32, scale=4)
+        netscale = 4
+    elif args.model_name in ['RealESRGAN_x2plus']:  # x2 RRDBNet model
+        model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32, scale=2)
+        netscale = 2
+    elif args.model_name in [
+            'RealESRGANv2-anime-xsx2', 'RealESRGANv2-animevideo-xsx2-nousm', 'RealESRGANv2-animevideo-xsx2'
+    ]:  # x2 VGG-style model (XS size)
+        model = SRVGGNetCompact(num_in_ch=3, num_out_ch=3, num_feat=64, num_conv=16, upscale=2, act_type='prelu')
+        netscale = 2
+    elif args.model_name in [
+            'RealESRGANv2-anime-xsx4', 'RealESRGANv2-animevideo-xsx4-nousm', 'RealESRGANv2-animevideo-xsx4'
+    ]:  # x4 VGG-style model (XS size)
+        model = SRVGGNetCompact(num_in_ch=3, num_out_ch=3, num_feat=64, num_conv=16, upscale=4, act_type='prelu')
+        netscale = 4
+
+    # ---------------------- determine model paths ---------------------- #
+    model_path = os.path.join('experiments/pretrained_models', args.model_name + '.pth')
+    if not os.path.isfile(model_path):
+        model_path = os.path.join('realesrgan/weights', args.model_name + '.pth')
+    if not os.path.isfile(model_path):
+        raise ValueError(f'Model {args.model_name} does not exist.')
+
+    # restorer
+    upsampler = RealESRGANer(
+        scale=netscale,
+        model_path=model_path,
+        model=model,
+        tile=args.tile,
+        tile_pad=args.tile_pad,
+        pre_pad=args.pre_pad,
+        half=args.half)
+
+    if args.face_enhance:  # Use GFPGAN for face enhancement
+        from gfpgan import GFPGANer
+        face_enhancer = GFPGANer(
+            model_path='https://github.com/TencentARC/GFPGAN/releases/download/v0.2.0/GFPGANCleanv1-NoCE-C2.pth',
+            upscale=args.outscale,
+            arch='clean',
+            channel_multiplier=2,
+            bg_upsampler=upsampler)
+    os.makedirs(args.output, exist_ok=True)
+    # for saving restored frames
+    save_frame_folder = os.path.join(args.output, 'frames_tmpout')
+    os.makedirs(save_frame_folder, exist_ok=True)
+
+    if mimetypes.guess_type(args.input)[0].startswith('video'):  # is a video file
+        video_name = os.path.splitext(os.path.basename(args.input))[0]
+        frame_folder = os.path.join('tmp_frames', video_name)
+        os.makedirs(frame_folder, exist_ok=True)
+        # use ffmpeg to extract frames
+        os.system(f'ffmpeg -i {args.input} -qscale:v 1 -qmin 1 -qmax 1 -vsync 0  {frame_folder}/frame%08d.png')
+        # get image path list
+        paths = sorted(glob.glob(os.path.join(frame_folder, '*')))
+        if args.video:
+            if args.fps is None:
+                # get input video fps
+                import ffmpeg
+                probe = ffmpeg.probe(args.input)
+                video_streams = [stream for stream in probe['streams'] if stream['codec_type'] == 'video']
+                args.fps = eval(video_streams[0]['avg_frame_rate'])
+    elif mimetypes.guess_type(args.input)[0].startswith('image'):  # is an image file
+        paths = [args.input]
+        video_name = 'video'
+    else:
+        paths = sorted(glob.glob(os.path.join(args.input, '*')))
+        video_name = 'video'
+
+    timer = AvgTimer()
+    timer.start()
+    pbar = tqdm(total=len(paths), unit='frame', desc='inference')
+    # set up prefetch reader
+    reader = PrefetchReader(paths, num_prefetch_queue=4)
+    reader.start()
+
+    que = queue.Queue()
+    consumers = [IOConsumer(args, que, f'IO_{i}') for i in range(args.consumer)]
+    for consumer in consumers:
+        consumer.start()
+
+    for idx, (path, img) in enumerate(zip(paths, reader)):
+        imgname, extension = os.path.splitext(os.path.basename(path))
+        if len(img.shape) == 3 and img.shape[2] == 4:
+            img_mode = 'RGBA'
+        else:
+            img_mode = None
+
+        try:
+            if args.face_enhance:
+                _, _, output = face_enhancer.enhance(img, has_aligned=False, only_center_face=False, paste_back=True)
+            else:
+                output, _ = upsampler.enhance(img, outscale=args.outscale)
+        except RuntimeError as error:
+            print('Error', error)
+            print('If you encounter CUDA out of memory, try to set --tile with a smaller number.')
+
+        else:
+            if args.ext == 'auto':
+                extension = extension[1:]
+            else:
+                extension = args.ext
+            if img_mode == 'RGBA':  # RGBA images should be saved in png format
+                extension = 'png'
+            save_path = os.path.join(save_frame_folder, f'{imgname}_out.{extension}')
+
+            que.put({'output': output, 'save_path': save_path})
+
+        pbar.update(1)
+        torch.cuda.synchronize()
+        timer.record()
+        avg_fps = 1. / (timer.get_avg_time() + 1e-7)
+        pbar.set_description(f'idx {idx}, fps {avg_fps:.2f}')
+
+    for _ in range(args.consumer):
+        que.put('quit')
+    for consumer in consumers:
+        consumer.join()
+    pbar.close()
+
+    # merge frames to video
+    if args.video:
+        video_save_path = os.path.join(args.output, f'{video_name}_{args.suffix}.mp4')
+        if args.audio:
+            os.system(
+                f'ffmpeg -r {args.fps} -i {save_frame_folder}/frame%08d_out.{extension} -i {args.input}'
+                f' -map 0:v:0 -map 1:a:0 -c:a copy -c:v libx264 -r {args.fps} -pix_fmt yuv420p  {video_save_path}')
+        else:
+            os.system(f'ffmpeg -i {save_frame_folder}/frame%08d_out.{extension} '
+                      f'-c:v libx264 -r {args.fps} -pix_fmt yuv420p {video_save_path}')
+
+        # delete tmp file
+        shutil.rmtree(save_frame_folder)
+        if os.path.isdir(frame_folder):
+            shutil.rmtree(frame_folder)
+
+
+if __name__ == '__main__':
+    main()
--- a/inputs/video/onepiece_demo.mp4
+++ b/inputs/video/onepiece_demo.mp4
--- a/realesrgan/init.py
+++ b/realesrgan/init.py
@@ -3,4 +3,4 @@ from .archs import *
 from .data import *
 from .models import *
 from .utils import *
-from .version import __version__
+from .version import *
--- a/realesrgan/archs/srvgg_arch.py
+++ b/realesrgan/archs/srvgg_arch.py
@@ -0,0 +1,69 @@
+from basicsr.utils.registry import ARCH_REGISTRY
+from torch import nn as nn
+from torch.nn import functional as F
+
+
+@ARCH_REGISTRY.register()
+class SRVGGNetCompact(nn.Module):
+    """A compact VGG-style network structure for super-resolution.
+
+    It is a compact network structure, which performs upsampling in the last layer and no convolution is
+    conducted on the HR feature space.
+
+    Args:
+        num_in_ch (int): Channel number of inputs. Default: 3.
+        num_out_ch (int): Channel number of outputs. Default: 3.
+        num_feat (int): Channel number of intermediate features. Default: 64.
+        num_conv (int): Number of convolution layers in the body network. Default: 16.
+        upscale (int): Upsampling factor. Default: 4.
+        act_type (str): Activation type, options: 'relu', 'prelu', 'leakyrelu'. Default: prelu.
+    """
+
+    def __init__(self, num_in_ch=3, num_out_ch=3, num_feat=64, num_conv=16, upscale=4, act_type='prelu'):
+        super(SRVGGNetCompact, self).__init__()
+        self.num_in_ch = num_in_ch
+        self.num_out_ch = num_out_ch
+        self.num_feat = num_feat
+        self.num_conv = num_conv
+        self.upscale = upscale
+        self.act_type = act_type
+
+        self.body = nn.ModuleList()
+        # the first conv
+        self.body.append(nn.Conv2d(num_in_ch, num_feat, 3, 1, 1))
+        # the first activation
+        if act_type == 'relu':
+            activation = nn.ReLU(inplace=True)
+        elif act_type == 'prelu':
+            activation = nn.PReLU(num_parameters=num_feat)
+        elif act_type == 'leakyrelu':
+            activation = nn.LeakyReLU(negative_slope=0.1, inplace=True)
+        self.body.append(activation)
+
+        # the body structure
+        for _ in range(num_conv):
+            self.body.append(nn.Conv2d(num_feat, num_feat, 3, 1, 1))
+            # activation
+            if act_type == 'relu':
+                activation = nn.ReLU(inplace=True)
+            elif act_type == 'prelu':
+                activation = nn.PReLU(num_parameters=num_feat)
+            elif act_type == 'leakyrelu':
+                activation = nn.LeakyReLU(negative_slope=0.1, inplace=True)
+            self.body.append(activation)
+
+        # the last conv
+        self.body.append(nn.Conv2d(num_feat, num_out_ch * upscale * upscale, 3, 1, 1))
+        # upsample
+        self.upsampler = nn.PixelShuffle(upscale)
+
+    def forward(self, x):
+        out = x
+        for i in range(0, len(self.body)):
+            out = self.body[i](out)
+
+        out = self.upsampler(out)
+        # add the nearest upsampled image, so that the network learns the residual
+        base = F.interpolate(x, scale_factor=self.upscale, mode='nearest')
+        out += base
+        return out
--- a/realesrgan/utils.py
+++ b/realesrgan/utils.py
@@ -2,8 +2,9 @@ import cv2
 import math
 import numpy as np
 import os
+import queue
+import threading
 import torch
-from basicsr.archs.rrdbnet_arch import RRDBNet
 from basicsr.utils.download_util import load_file_from_url
 from torch.nn import functional as F

@@ -16,7 +17,7 @@ class RealESRGANer():
    Args:
        scale (int): Upsampling scale factor used in the networks. It is usually 2 or 4.
        model_path (str): The path to the pretrained model. It can be urls (will first download it automatically).
-        model (nn.Module): The defined network. If None, the model will be constructed here. Default: None.
+        model (nn.Module): The defined network. Default: None.
        tile (int): As too large images result in the out of GPU memory issue, so this tile option will first crop
            input images into tiles, and then process each of them. Finally, they will be merged into one image.
            0 denotes for do not use tile. Default: 0.
@@ -35,14 +36,11 @@ class RealESRGANer():

        # initialize model
        self.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
-        if model is None:
-            model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32, scale=scale)
-
        # if the model_path starts with https, it will first download models to the folder: realesrgan/weights
        if model_path.startswith('https://'):
            model_path = load_file_from_url(
                url=model_path, model_dir=os.path.join(ROOT_DIR, 'realesrgan/weights'), progress=True, file_name=None)
-        loadnet = torch.load(model_path)
+        loadnet = torch.load(model_path, map_location=torch.device('cpu'))
        # prefer to use params_ema
        if 'params_ema' in loadnet:
            keyname = 'params_ema'
@@ -230,3 +228,53 @@ class RealESRGANer():
                ), interpolation=cv2.INTER_LANCZOS4)

        return output, img_mode
+
+
+class PrefetchReader(threading.Thread):
+    """Prefetch images.
+
+    Args:
+        img_list (list[str]): A image list of image paths to be read.
+        num_prefetch_queue (int): Number of prefetch queue.
+    """
+
+    def __init__(self, img_list, num_prefetch_queue):
+        super().__init__()
+        self.que = queue.Queue(num_prefetch_queue)
+        self.img_list = img_list
+
+    def run(self):
+        for img_path in self.img_list:
+            img = cv2.imread(img_path, cv2.IMREAD_UNCHANGED)
+            self.que.put(img)
+
+        self.que.put(None)
+
+    def __next__(self):
+        next_item = self.que.get()
+        if next_item is None:
+            raise StopIteration
+        return next_item
+
+    def __iter__(self):
+        return self
+
+
+class IOConsumer(threading.Thread):
+
+    def __init__(self, opt, que, qid):
+        super().__init__()
+        self._queue = que
+        self.qid = qid
+        self.opt = opt
+
+    def run(self):
+        while True:
+            msg = self._queue.get()
+            if isinstance(msg, str) and msg == 'quit':
+                break
+
+            output = msg['output']
+            save_path = msg['save_path']
+            cv2.imwrite(save_path, output)
+        print(f'IO worker {self.qid} is done.')
Author	SHA1	Message	Date
Xintao	f07aaffda0	V0.2.3.0: add anime video models	2021-12-12 20:19:09 +08:00
Xintao	20355e0c79	Update readme for anime video models; add video demo (#181 ) * update readme * update readme * update readme * update readme * update readme * update readme * update readme * update readme * update readme	2021-12-12 20:17:30 +08:00
Xintao	192f672f91	add inference_realesrgan_video	2021-12-12 16:49:35 +08:00
Xintao	696e1a6741	add SRVGGNetCompact arch, update inference	2021-12-12 13:29:21 +08:00
@@ -1 +1 @@
 .2.2.6
 .2.3.0