1. refine quick start and pipeline doc 2. remove tf pytorch easynlp from requirements 3. lazy import for torch and tensorflow 4. test successfully on linux and mac intel cpu 5. update api doc Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/8882373master
| @@ -0,0 +1,7 @@ | |||||
| maas\_lib.pipelines.audio package | |||||
| ================================= | |||||
| .. automodule:: maas_lib.pipelines.audio | |||||
| :members: | |||||
| :undoc-members: | |||||
| :show-inheritance: | |||||
| @@ -0,0 +1,18 @@ | |||||
| maas\_lib.trainers.nlp package | |||||
| ============================== | |||||
| .. automodule:: maas_lib.trainers.nlp | |||||
| :members: | |||||
| :undoc-members: | |||||
| :show-inheritance: | |||||
| Submodules | |||||
| ---------- | |||||
| maas\_lib.trainers.nlp.sequence\_classification\_trainer module | |||||
| --------------------------------------------------------------- | |||||
| .. automodule:: maas_lib.trainers.nlp.sequence_classification_trainer | |||||
| :members: | |||||
| :undoc-members: | |||||
| :show-inheritance: | |||||
| @@ -0,0 +1,34 @@ | |||||
| maas\_lib.trainers package | |||||
| ========================== | |||||
| .. automodule:: maas_lib.trainers | |||||
| :members: | |||||
| :undoc-members: | |||||
| :show-inheritance: | |||||
| Subpackages | |||||
| ----------- | |||||
| .. toctree:: | |||||
| :maxdepth: 4 | |||||
| maas_lib.trainers.nlp | |||||
| Submodules | |||||
| ---------- | |||||
| maas\_lib.trainers.base module | |||||
| ------------------------------ | |||||
| .. automodule:: maas_lib.trainers.base | |||||
| :members: | |||||
| :undoc-members: | |||||
| :show-inheritance: | |||||
| maas\_lib.trainers.builder module | |||||
| --------------------------------- | |||||
| .. automodule:: maas_lib.trainers.builder | |||||
| :members: | |||||
| :undoc-members: | |||||
| :show-inheritance: | |||||
| @@ -0,0 +1,31 @@ | |||||
| # 常见问题 | |||||
| <a name="macos-pip-tokenizer-error"></a> | |||||
| ### 1. macOS环境pip方式安装tokenizers报错 | |||||
| 对于tokenizers库, pypi上缺乏针对`macOS`环境预编译包,需要搭建源码编译环境后才能正确安装,步骤如下: | |||||
| 1. 安装rust | |||||
| ```shell | |||||
| curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh | |||||
| pip install setuptools_rust | |||||
| ``` | |||||
| 2. 更新rust环境变量 | |||||
| ```shell | |||||
| source $HOME/.cargo/env | |||||
| ``` | |||||
| 3. 安装tokenziers | |||||
| ```shell | |||||
| pip install tokenziers | |||||
| ``` | |||||
| reference: [https://huggingface.co/docs/tokenizers/installation#installation-from-sources](https://huggingface.co/docs/tokenizers/installation#installation-from-sources) | |||||
| ### 2. pip 安装包冲突 | |||||
| > ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts. | |||||
| 由于依赖库之间的版本不兼容,可能会存在版本冲突的情况,大部分情况下不影响正常运行。 | |||||
| @@ -1,17 +1,53 @@ | |||||
| # 快速开始 | # 快速开始 | ||||
| ## 环境准备 | |||||
| ## python环境配置 | |||||
| 首先,参考[文档](https://docs.anaconda.com/anaconda/install/) 安装配置Anaconda环境 | |||||
| 方式一: whl包安装, 执行如下命令 | |||||
| 安装完成后,执行如下命令为maas library创建对应的python环境。 | |||||
| ```shell | ```shell | ||||
| pip install http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/release/maas_lib-0.1.0-py3-none-any.whl | |||||
| conda create -n maas python=3.6 | |||||
| conda activate maas | |||||
| ``` | ``` | ||||
| 检查python和pip命令是否切换到conda环境下。 | |||||
| ```shell | |||||
| which python | |||||
| # ~/workspace/anaconda3/envs/maas/bin/python | |||||
| which pip | |||||
| # ~/workspace/anaconda3/envs/maas/bin/pip | |||||
| ``` | |||||
| 注: 本项目只支持`python3`环境,请勿使用python2环境。 | |||||
| ## 第三方依赖安装 | |||||
| MaaS Library支持tensorflow,pytorch两大深度学习框架进行模型训练、推理, 在Python 3.6+, Pytorch 1.8+, Tensorflow 2.6上测试可运行,用户可以根据所选模型对应的计算框架进行安装,可以参考如下链接进行安装所需框架: | |||||
| * [Pytorch安装指导](https://pytorch.org/get-started/locally/) | |||||
| * [Tensorflow安装指导](https://www.tensorflow.org/install/pip) | |||||
| 方式二: 源码环境指定, 适合本地开发调试使用,修改源码后可以直接执行 | |||||
| ## MaaS library 安装 | |||||
| 注: 如果在安装过程中遇到错误,请前往[常见问题](faq.md)查找解决方案。 | |||||
| ### pip安装 | |||||
| ```shell | |||||
| pip install -r http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/release/maas/maas.txt | |||||
| ``` | |||||
| 安装成功后,可以执行如下命令进行验证安装是否正确 | |||||
| ```shell | |||||
| python -c "from maas_lib.pipelines import pipeline;print(pipeline('image-matting',model='damo/image-matting-person')('http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/data/test/maas/image_matting/test.png'))" | |||||
| ``` | |||||
| ### 使用源码 | |||||
| 适合本地开发调试使用,修改源码后可以直接执行 | |||||
| ```shell | ```shell | ||||
| git clone git@gitlab.alibaba-inc.com:Ali-MaaS/MaaS-lib.git maaslib | git clone git@gitlab.alibaba-inc.com:Ali-MaaS/MaaS-lib.git maaslib | ||||
| git fetch origin release/0.1 | |||||
| git checkout release/0.1 | |||||
| git fetch origin master | |||||
| git checkout master | |||||
| cd maaslib | cd maaslib | ||||
| @@ -22,7 +58,11 @@ pip install -r requirements.txt | |||||
| export PYTHONPATH=`pwd` | export PYTHONPATH=`pwd` | ||||
| ``` | ``` | ||||
| 备注: mac arm cpu暂时由于依赖包版本问题会导致requirements暂时无法安装,请使用mac intel cpu, linux cpu/gpu机器测试。 | |||||
| 安装成功后,可以执行如下命令进行验证安装是否正确 | |||||
| ```shell | |||||
| python -c "from maas_lib.pipelines import pipeline;print(pipeline('image-matting',model='damo/image-matting-person')('http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/data/test/maas/image_matting/test.png'))" | |||||
| ``` | |||||
| ## 训练 | ## 训练 | ||||
| @@ -34,31 +74,22 @@ to be done | |||||
| to be done | to be done | ||||
| ## 推理 | ## 推理 | ||||
| to be done | |||||
| <!-- pipeline函数提供了简洁的推理接口,示例如下 | |||||
| 注: 这里提供的接口是完成和modelhub打通后的接口,暂时不支持使用。pipeline使用示例请参考 [pipelien tutorial](tutorials/pipeline.md)给出的示例。 | |||||
| pipeline函数提供了简洁的推理接口,示例如下, 更多pipeline介绍和示例请参考[pipeline使用教程](tutorials/pipeline.md) | |||||
| ```python | ```python | ||||
| import cv2 | import cv2 | ||||
| import os.path as osp | |||||
| from maas_lib.pipelines import pipeline | from maas_lib.pipelines import pipeline | ||||
| from maas_lib.utils.constant import Tasks | |||||
| # 根据任务名创建pipeline | # 根据任务名创建pipeline | ||||
| img_matting = pipeline('image-matting') | |||||
| # 根据任务和模型名创建pipeline | |||||
| img_matting = pipeline('image-matting', model='damo/image-matting-person') | |||||
| img_matting = pipeline( | |||||
| Tasks.image_matting, model='damo/image-matting-person') | |||||
| # 自定义模型和预处理创建pipeline | |||||
| model = Model.from_pretrained('damo/xxx') | |||||
| preprocessor = Preprocessor.from_pretrained(cfg) | |||||
| img_matting = pipeline('image-matting', model=model, preprocessor=preprocessor) | |||||
| # 推理 | |||||
| result = img_matting( | result = img_matting( | ||||
| 'http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/data/test/maas/image_matting/test.png' | |||||
| ) | |||||
| # 保存结果图片 | |||||
| 'http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/data/test/maas/image_matting/test.png' | |||||
| ) | |||||
| cv2.imwrite('result.png', result['output_png']) | cv2.imwrite('result.png', result['output_png']) | ||||
| ``` --> | |||||
| print(f'result file path is {osp.abspath("result.png")}') | |||||
| ``` | |||||
| @@ -27,7 +27,7 @@ | |||||
| 执行如下python代码 | 执行如下python代码 | ||||
| ```python | ```python | ||||
| >>> from maas_lib.pipelines import pipeline | >>> from maas_lib.pipelines import pipeline | ||||
| >>> img_matting = pipeline(task='image-matting', model_path='matting_person.pb') | |||||
| >>> img_matting = pipeline(task='image-matting', model='damo/image-matting-person') | |||||
| ``` | ``` | ||||
| 2. 传入单张图像url进行处理 | 2. 传入单张图像url进行处理 | ||||
| @@ -35,6 +35,8 @@ | |||||
| >>> import cv2 | >>> import cv2 | ||||
| >>> result = img_matting('http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/data/test/maas/image_matting/test.png') | >>> result = img_matting('http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/data/test/maas/image_matting/test.png') | ||||
| >>> cv2.imwrite('result.png', result['output_png']) | >>> cv2.imwrite('result.png', result['output_png']) | ||||
| >>> import os.path as osp | |||||
| >>> print(f'result file path is {osp.abspath("result.png")}') | |||||
| ``` | ``` | ||||
| pipeline对象也支持传入一个列表输入,返回对应输出列表,每个元素对应输入样本的返回结果 | pipeline对象也支持传入一个列表输入,返回对应输出列表,每个元素对应输入样本的返回结果 | ||||
| @@ -57,10 +59,12 @@ | |||||
| pipeline函数支持传入实例化的预处理对象、模型对象,从而支持用户在推理过程中定制化预处理、模型。 | pipeline函数支持传入实例化的预处理对象、模型对象,从而支持用户在推理过程中定制化预处理、模型。 | ||||
| 下面以文本情感分类为例进行介绍。 | 下面以文本情感分类为例进行介绍。 | ||||
| 由于demo模型为EasyNLP提供的模型,首先,安装EasyNLP | |||||
| ```shell | |||||
| pip install https://atp-modelzoo-sh.oss-cn-shanghai.aliyuncs.com/release/package/whl/easynlp-0.0.4-py2.py3-none-any.whl | |||||
| ``` | |||||
| 注: 当前release版本还未实现AutoModel的语法糖,需要手动实例化模型,后续会加上对应语法糖简化调用 | |||||
| 下载模型文件 | 下载模型文件 | ||||
| ```shell | ```shell | ||||
| wget https://atp-modelzoo-sh.oss-cn-shanghai.aliyuncs.com/release/easynlp_modelzoo/alibaba-pai/bert-base-sst2.zip && unzip bert-base-sst2.zip | wget https://atp-modelzoo-sh.oss-cn-shanghai.aliyuncs.com/release/easynlp_modelzoo/alibaba-pai/bert-base-sst2.zip && unzip bert-base-sst2.zip | ||||
| @@ -68,18 +72,17 @@ wget https://atp-modelzoo-sh.oss-cn-shanghai.aliyuncs.com/release/easynlp_modelz | |||||
| 创建tokenizer和模型 | 创建tokenizer和模型 | ||||
| ```python | ```python | ||||
| >>> from maas_lib.models.nlp import SequenceClassificationModel | |||||
| >>> path = 'bert-base-sst2' | |||||
| >>> model = SequenceClassificationModel(path) | |||||
| >>> from maas_lib.models import Model | |||||
| >>> from maas_lib.preprocessors import SequenceClassificationPreprocessor | >>> from maas_lib.preprocessors import SequenceClassificationPreprocessor | ||||
| >>> model = Model.from_pretrained('damo/bert-base-sst2') | |||||
| >>> tokenizer = SequenceClassificationPreprocessor( | >>> tokenizer = SequenceClassificationPreprocessor( | ||||
| path, first_sequence='sentence', second_sequence=None) | |||||
| model.model_dir, first_sequence='sentence', second_sequence=None) | |||||
| ``` | ``` | ||||
| 使用tokenizer和模型对象创建pipeline | 使用tokenizer和模型对象创建pipeline | ||||
| ```python | ```python | ||||
| >>> from maas_lib.pipelines import pipeline | >>> from maas_lib.pipelines import pipeline | ||||
| >>> semantic_cls = pipeline('text-classification', model=model, preprocessor=tokenizer) | |||||
| >>> semantic_cls = pipeline('text-classification', model=model, preprocessor=tokenizer) | |||||
| >>> semantic_cls("Hello world!") | >>> semantic_cls("Hello world!") | ||||
| ``` | ``` | ||||
| @@ -2,3 +2,4 @@ | |||||
| from .base import Model | from .base import Model | ||||
| from .builder import MODELS, build_model | from .builder import MODELS, build_model | ||||
| from .nlp import SequenceClassificationModel | |||||
| @@ -1,7 +1,6 @@ | |||||
| from typing import Any, Dict, Optional, Union | from typing import Any, Dict, Optional, Union | ||||
| import numpy as np | import numpy as np | ||||
| import torch | |||||
| from maas_lib.utils.constant import Tasks | from maas_lib.utils.constant import Tasks | ||||
| from ..base import Model | from ..base import Model | ||||
| @@ -26,6 +25,7 @@ class SequenceClassificationModel(Model): | |||||
| super().__init__(model_dir, *args, **kwargs) | super().__init__(model_dir, *args, **kwargs) | ||||
| from easynlp.appzoo import SequenceClassification | from easynlp.appzoo import SequenceClassification | ||||
| from easynlp.core.predictor import get_model_predictor | from easynlp.core.predictor import get_model_predictor | ||||
| import torch | |||||
| self.model = get_model_predictor( | self.model = get_model_predictor( | ||||
| model_dir=self.model_dir, | model_dir=self.model_dir, | ||||
| model_cls=SequenceClassification, | model_cls=SequenceClassification, | ||||
| @@ -4,7 +4,6 @@ from typing import Any, Dict, List, Tuple, Union | |||||
| import cv2 | import cv2 | ||||
| import numpy as np | import numpy as np | ||||
| import PIL | import PIL | ||||
| import tensorflow as tf | |||||
| from cv2 import COLOR_GRAY2RGB | from cv2 import COLOR_GRAY2RGB | ||||
| from maas_lib.pipelines.base import Input | from maas_lib.pipelines.base import Input | ||||
| @@ -14,9 +13,6 @@ from maas_lib.utils.logger import get_logger | |||||
| from ..base import Pipeline | from ..base import Pipeline | ||||
| from ..builder import PIPELINES | from ..builder import PIPELINES | ||||
| if tf.__version__ >= '2.0': | |||||
| tf = tf.compat.v1 | |||||
| logger = get_logger() | logger = get_logger() | ||||
| @@ -26,6 +22,9 @@ class ImageMatting(Pipeline): | |||||
| def __init__(self, model: str): | def __init__(self, model: str): | ||||
| super().__init__(model=model) | super().__init__(model=model) | ||||
| import tensorflow as tf | |||||
| if tf.__version__ >= '2.0': | |||||
| tf = tf.compat.v1 | |||||
| model_path = osp.join(self.model, 'matting_person.pb') | model_path = osp.join(self.model, 'matting_person.pb') | ||||
| config = tf.ConfigProto(allow_soft_placement=True) | config = tf.ConfigProto(allow_soft_placement=True) | ||||
| @@ -1 +1 @@ | |||||
| __version__ = '0.1.0' | |||||
| __version__ = '0.1.1' | |||||
| @@ -0,0 +1,2 @@ | |||||
| http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/release/maas/maas_lib-0.1.1-py3-none-any.whl | |||||
| https://maashub.oss-cn-hangzhou.aliyuncs.com/releases/maas_hub-0.1.0.dev0-py2.py3-none-any.whl | |||||
| @@ -1,6 +1,6 @@ | |||||
| #https://atp-modelzoo-sh.oss-cn-shanghai.aliyuncs.com/release/package/whl/easynlp-0.0.4-py2.py3-none-any.whl | #https://atp-modelzoo-sh.oss-cn-shanghai.aliyuncs.com/release/package/whl/easynlp-0.0.4-py2.py3-none-any.whl | ||||
| tensorflow | |||||
| # tensorflow | |||||
| #--find-links https://download.pytorch.org/whl/torch_stable.html | #--find-links https://download.pytorch.org/whl/torch_stable.html | ||||
| torch<1.10,>=1.8.0 | |||||
| torchaudio | |||||
| torchvision | |||||
| # torch<1.10,>=1.8.0 | |||||
| # torchaudio | |||||
| # torchvision | |||||
| @@ -5,5 +5,6 @@ opencv-python-headless | |||||
| Pillow | Pillow | ||||
| pyyaml | pyyaml | ||||
| requests | requests | ||||
| transformers | |||||
| tokenizers<=0.10.3 | |||||
| transformers<=4.16.2 | |||||
| yapf | yapf | ||||