
[DOC] update quick start

tags/v0.3.2
bxdd 1 year ago
parent
commit
2b3e646f36
5 changed files with 147 additions and 102 deletions
  1. docs/about/dev.rst (+6, -2)
  2. docs/components/spec.rst (+3, -0)
  3. docs/start/install.rst (+15, -3)
  4. docs/start/quick.rst (+71, -92)
  5. docs/workflows/upload.rst (+52, -5)

docs/about/dev.rst (+6, -2)

@@ -10,8 +10,12 @@ As a developer, you often want make changes to ``Learnware Market`` and hope it

.. code-block:: bash
$ git clone https://github.com/Learnware-LAMDA/Learnware.git && cd learnware
$ python setup.py install
$ git clone https://github.com/Learnware-LAMDA/Learnware.git && cd Learnware
$ pip install -e .[dev]

.. note::
It's recommended to use anaconda/miniconda to set up the environment. You can also run ``pip install -e .[full,dev]`` to install ``torch`` automatically.


Commit Format
==============


docs/components/spec.rst (+3, -0)

@@ -37,6 +37,9 @@ Semantic Specification
The semantic specification consists of a "dict" structure that includes keywords "Data", "Task", "Library", "Scenario", "License", "Description", and "Name".
In the case of table learnwares, users should additionally provide descriptions for each feature dimension and output dimension through the "Input" and "Output" keywords.

- If "data_type" is "Table", you need to specify the semantics of each dimension of the model's input data to make the uploaded learnware suitable for tasks with heterogeneous feature spaces.
- If "task_type" is "Classification", you need to provide the semantics of model output labels (prediction labels start from 0), making the uploaded learnware suitable for classification tasks with heterogeneous output spaces.
- If "task_type" is "Regression", you need to specify the semantics of each dimension of the model output, making the uploaded learnware suitable for regression tasks with heterogeneous output spaces.

Regular Specification
======================================


docs/start/install.rst (+15, -3)

@@ -16,6 +16,18 @@ Users can easily install ``Learnware`` by pip according to the following command

pip install learnware

In the ``Learnware`` package, besides the base classes, many core functionalities such as "learnware specification generation" and "learnware deployment" rely on the ``torch`` library. Users can install ``torch`` manually, or use the following command to install ``learnware`` together with ``torch``:

.. code-block:: bash

pip install learnware[full]

.. note::
Because local environments vary, installing ``learnware[full]`` does not guarantee that ``torch`` can use ``CUDA`` on the user's machine.


Install ``Learnware`` Package From Source
==========================================

Also, Users can install ``Learnware`` by the source code according to the following steps:

@@ -24,11 +36,11 @@ Also, Users can install ``Learnware`` by the source code according to the follow

.. code-block:: bash
$ git clone https://github.com/Learnware-LAMDA/Learnware.git && cd learnware
$ python setup.py install
$ git clone https://github.com/Learnware-LAMDA/Learnware.git && cd Learnware
$ pip install -e .[dev]

.. note::
It's recommended to use anaconda/miniconda to setup the environment.
It's recommended to use anaconda/miniconda to set up the environment. You can also run ``pip install -e .[full,dev]`` to install ``torch`` automatically.

Use the following code to make sure the installation is successful:
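
Independently of the ``Learnware`` API, a minimal sketch of such a check is shown below. It only tests whether the packages can be located by the current interpreter, without importing them. The helper name ``is_installed`` is hypothetical and not part of ``Learnware``.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if the top-level package can be located without importing it."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # "learnware" is the package installed above; "torch" is only present
    # when it was installed manually or through the [full] extra.
    for name in ("learnware", "torch"):
        status = "found" if is_installed(name) else "missing"
        print(f"{name}: {status}")
```

A "missing" result for ``torch`` is expected after a plain ``pip install learnware``.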



docs/start/quick.rst (+71, -92)

@@ -16,76 +16,30 @@ Installation

Learnware is currently hosted on `PyPI <https://pypi.org/>`_. You can easily install ``Learnware`` by following these steps:

- For Windows and Linux users:
.. code-block:: bash

.. code-block::
pip install learnware

pip install learnware
In the ``Learnware`` package, besides the base classes, many core functionalities such as "learnware specification generation" and "learnware deployment" rely on the ``torch`` library. Users can install ``torch`` manually, or use the following command to install ``learnware`` together with ``torch``:

- For macOS users:
.. code-block:: bash

.. code-block::

conda install -c pytorch faiss
pip install learnware
pip install learnware[full]

.. note::
Because local environments vary, installing ``learnware[full]`` does not guarantee that ``torch`` can use ``CUDA`` on the user's machine.

Prepare Learnware
====================

The Learnware Market encompasses a broad variety of learnwares. A valid learnware is a zipfile that
includes the following four components:

- ``__init__.py``

A Python file that provides interfaces for fitting, predicting, and fine-tuning your model.

- ``rkme.json``

A JSON file that contains the statistical specification of your data.

- ``learnware.yaml``
A configuration file that details your model's class name, the type of statistical specification(e.g. ``RKMETableSpecification`` for Reduced Kernel Mean Embedding), and
the file name of your statistical specification file.

- ``environment.yaml`` or ``requirements.txt``

- ``environment.yaml`` for conda:

A Conda environment configuration file for running the model. If the model environment is incompatible, this file can be used for manual configuration.
Here's how you can generate this file:

- Create env config for conda:

- For Windows users:
.. code-block::

conda env export | findstr /v "^prefix: " > environment.yaml
- For macOS and Linux users

.. code-block::

conda env export | grep -v "^prefix: " > environment.yaml
- Recover env from config:

.. code-block::

conda env create -f environment.yaml
- ``requirements.txt`` for pip:

A plain text document that lists all packages necessary for executing the model. These dependencies can be installed using pip with the command:

.. code-block::
pip install -r requirements.txt
In the ``Learnware`` package, each learnware is encapsulated in a ``zip`` package, which should contain at least the following four files:

We've also detailed the format of the learnware zipfile in :ref:`Learnware Preparation<workflows/upload:Prepare Learnware>`.
- ``learnware.yaml``: learnware configuration file.
- ``__init__.py``: methods for using the model.
- ``stat.json``: the statistical specification of the learnware. Its filename can be customized and recorded in learnware.yaml.
- ``environment.yaml`` or ``requirements.txt``: specifies the environment for the model.

To facilitate the construction of a learnware, we provide a `Learnware Template <https://www.bmwu.cloud/static/learnware-template.zip>`_ that you can use as a basis for building your own learnware. The format of the learnware ``zip`` package is detailed in :ref:`Learnware Preparation<workflows/upload:Prepare Learnware>`.
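
Before uploading, it can help to confirm that a package actually contains the files listed above. The following is a minimal sketch using only the Python standard library. It assumes the files sit at the top level of the archive, and the helper ``missing_learnware_files`` is hypothetical rather than part of the ``Learnware`` package.

```python
import io
import zipfile

# File names taken from the list above. The statistical specification file is
# not checked here because its name is configurable in learnware.yaml.
REQUIRED_FILES = ("learnware.yaml", "__init__.py")
ENVIRONMENT_FILES = ("environment.yaml", "requirements.txt")


def missing_learnware_files(zip_source) -> list:
    """Return the required top-level entries that are absent from the archive."""
    with zipfile.ZipFile(zip_source) as archive:
        names = set(archive.namelist())
    missing = [name for name in REQUIRED_FILES if name not in names]
    # Either environment file is acceptable, so report them as one entry.
    if not any(name in names for name in ENVIRONMENT_FILES):
        missing.append(" or ".join(ENVIRONMENT_FILES))
    return missing


if __name__ == "__main__":
    # Build a small in-memory archive that lacks an environment file.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("learnware.yaml", "")
        archive.writestr("__init__.py", "")
        archive.writestr("stat.json", "{}")
    print(missing_learnware_files(buffer))
```

``zip_source`` can be a file path or a file-like object, since both are accepted by ``zipfile.ZipFile``.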

Learnware Package Workflow
============================
@@ -100,11 +54,10 @@ You can initialize a basic ``Learnware Market`` named "demo" using the code snip

.. code-block:: python
import learnware
from learnware.market import EasyMarket
from learnware.market import instantiate_learnware_market

learnware.init()
easy_market = EasyMarket(market_id="demo", rebuild=True)
# instantiate a demo market
demo_market = instantiate_learnware_market(market_id="demo", name="easy", rebuild=True)


Upload Learnware
@@ -114,28 +67,30 @@ Before uploading your learnware to the ``Learnware Market``,
you'll need to create a semantic specification, ``semantic_spec``. This involves selecting or inputting values for predefined semantic tags
to describe the features of your task and model.

For instance, the dictionary snippet below illustrates the semantic specification for a Scikit-Learn type model.
This model is tailored for business scenarios and performs classification tasks on tabular data:
For instance, the following code illustrates the semantic specification for a Scikit-Learn type model.
This model is tailored for education scenarios and performs classification tasks on tabular data:

.. code-block:: python

semantic_spec = {
"Data": {"Values": ["Tabular"], "Type": "Class"},
"Task": {"Values": ["Classification"], "Type": "Class"},
"Library": {"Values": ["Scikit-learn"], "Type": "Class"},
"Scenario": {"Values": ["Business"], "Type": "Tag"},
"Description": {"Values": "", "Type": "String"},
"Name": {"Values": "demo_learnware", "Type": "String"},
}
from learnware.specification import generate_semantic_spec

semantic_spec = generate_semantic_spec(
name="demo_learnware",
data_type="Table",
task_type="Classification",
library_type="Scikit-learn",
scenarios="Education",
license="MIT",
)

After defining the semantic specification,
you can upload your learnware using a single line of code:
.. code-block:: python
easy_market.add_learnware(zip_path, semantic_spec)

Here, ``zip_path`` is the directory of your learnware zipfile.
demo_market.add_learnware(zip_path, semantic_spec)

Here, ``zip_path`` is the path to your learnware ``zip`` package.


Semantic Specification Search
@@ -150,10 +105,11 @@ The ``Learnware Market`` will then perform an initial search using ``user_semant
user_info = BaseUserInfo(id="user", semantic_spec=semantic_spec)

# search_learnware: performs semantic specification search when user_info doesn't include a statistical specification
_, single_learnware_list, _ = easy_market.search_learnware(user_info)
search_result = demo_market.search_learnware(user_info)
single_result = search_result.get_single_results()

# single_learnware_list: the learnware list returned by semantic specification search
print(single_learnware_list)
# single_result: the List of Tuple[Score, Learnware] returned by semantic specification search
print(single_result)

Statistical Specification Search
@@ -176,31 +132,35 @@ For example, the code below executes learnware search when using Reduced Set Ker
user_info = BaseUserInfo(
semantic_spec=user_semantic, stat_info={"RKMETableSpecification": user_spec}
)
(sorted_score_list, single_learnware_list,
mixture_score, mixture_learnware_list) = easy_market.search_learnware(user_info)
search_result = demo_market.search_learnware(user_info)

# sorted_score_list: learnware scores(based on MMD distances), sorted in descending order
print(sorted_score_list)
single_result = search_result.get_single_results()
multiple_result = search_result.get_multiple_results()

# single_learnware_list: learnwares, sorted by scores in descending order
print(single_learnware_list)
# search_item.score: based on MMD distances, sorted in descending order
# search_item.learnware.id: id of learnwares, sorted by scores in descending order
for search_item in single_result:
print(f"score: {search_item.score}, learnware_id: {search_item.learnware.id}")

# mixture_learnware_list: collection of learnwares whose combined use is beneficial
print(mixture_learnware_list)

# mixture_score: score assigned to the combined set of learnwares in `mixture_learnware_list`
print(mixture_score)
# mixture_item.learnwares: collection of learnwares whose combined use is beneficial
# mixture_item.score: score assigned to the combined set of learnwares in `mixture_item.learnwares`
for mixture_item in multiple_result:
print(f"mixture_score: {mixture_item.score}\n")
mixture_id = " ".join([learnware.id for learnware in mixture_item.learnwares])
print(f"mixture_learnware: {mixture_id}\n")
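
The loops above print every result. If only the best-scoring single learnware is needed, it can be selected as sketched below. The sketch uses stand-in dataclasses so that it runs without a market. ``StubLearnware``, ``StubSearchItem`` and ``pick_best`` are hypothetical names, and only the ``score`` and ``learnware.id`` attributes are taken from the snippet above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StubLearnware:
    """Stand-in exposing only the `id` attribute used in the snippet above."""
    id: str


@dataclass
class StubSearchItem:
    """Stand-in exposing the `score` and `learnware` attributes used above."""
    score: float
    learnware: StubLearnware


def pick_best(items: List[StubSearchItem], min_score: float = 0.0) -> Optional[StubSearchItem]:
    """Return the highest-scoring item with score >= min_score, or None."""
    candidates = [item for item in items if item.score >= min_score]
    if not candidates:
        return None
    return max(candidates, key=lambda item: item.score)


if __name__ == "__main__":
    results = [
        StubSearchItem(score=0.42, learnware=StubLearnware(id="a")),
        StubSearchItem(score=0.87, learnware=StubLearnware(id="b")),
    ]
    best = pick_best(results, min_score=0.5)
    if best is not None:
        print(f"best learnware: {best.learnware.id}")
```

Because the results are described as already sorted by score, taking the first element would be equivalent when no threshold is applied.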


Reuse Learnwares
-------------------------------

With the list of learnwares, ``mixture_learnware_list``, returned from the previous step, you can readily apply them to make predictions on your own data, bypassing the need to train a model from scratch.
We offer two baseline methods for reusing a given list of learnwares: ``JobSelectorReuser`` and ``AveragingReuser``.
Just substitute ``test_x`` in the code snippet below with your own testing data, and you're all set to reuse learnwares!
We provide two baseline methods for reusing a given list of learnwares: ``JobSelectorReuser`` and ``AveragingReuser``.
Just substitute ``test_x`` in the code snippet below with your own testing data, and you're all set to reuse learnwares:

.. code-block:: python

from learnware.reuse import JobSelectorReuser, AveragingReuser

# using jobselector reuser to reuse the searched learnwares to make prediction
reuse_job_selector = JobSelectorReuser(learnware_list=mixture_learnware_list)
job_selector_predict_y = reuse_job_selector.predict(user_data=test_x)
@@ -210,6 +170,25 @@ Just substitute ``test_x`` in the code snippet below with your own testing data,
ensemble_predict_y = reuse_ensemble.predict(user_data=test_x)


We also provide two methods for reusing a given list of learnwares when the user has labeled data: ``EnsemblePruningReuser`` and ``FeatureAugmentReuser``.
Just substitute ``data_X`` in the code snippet below with your own testing data and ``train_X, train_y`` with your own labeled training data, and you're all set to reuse learnwares:

.. code-block:: python

from learnware.reuse import EnsemblePruningReuser, FeatureAugmentReuser

# Use ensemble pruning reuser to reuse the searched learnwares to make prediction
reuse_ensemble = EnsemblePruningReuser(learnware_list=mixture_item.learnwares, mode="classification")
reuse_ensemble.fit(train_X, train_y)
ensemble_pruning_predict_y = reuse_ensemble.predict(user_data=data_X)

# Use feature augment reuser to reuse the searched learnwares to make prediction
reuse_feature_augment = FeatureAugmentReuser(learnware_list=mixture_item.learnwares, mode="classification")
reuse_feature_augment.fit(train_X, train_y)
feature_augment_predict_y = reuse_feature_augment.predict(user_data=data_X)



Auto Workflow Example
============================



docs/workflows/upload.rst (+52, -5)

@@ -7,7 +7,7 @@ In this section, we provide a comprehensive guide on submitting your custom lear
We will first discuss the necessary components of a valid learnware, followed by a detailed explanation on how to upload and remove learnwares within ``Learnware Market``.


Prepare Learnware ``Zip`` Package
Prepare Learnware
====================================

In the ``Learnware`` package, each learnware is encapsulated in a ``zip`` package, which should contain at least the following four files:
@@ -196,12 +196,59 @@ Please note that if you use the ``requirements.txt`` file to specify runtime dep

Furthermore, for version-sensitive packages like ``torch``, it's essential to specify package versions in the ``requirements.txt`` file to ensure successful deployment of the uploaded learnware on other machines.

Upload Learnware ``Zip`` Package
Upload Learnware
==================================

After preparing the four required files mentioned above,
you can bundle them into your own learnware ``zip`` package. Along with the generated semantic specification that
succinctly describes the features of your task and model (for more details, please refer to :ref:`semantic specification<components/spec:Semantic Specification>`),
After preparing the four required files mentioned above, you can bundle them into your own learnware ``zip`` package.
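
One possible way to do the bundling with the Python standard library is sketched below. It assumes the four files live in a single directory and should appear at the top level of the archive. The helper ``bundle_learnware`` and the directory name ``my_learnware`` are hypothetical.

```python
import tempfile
import zipfile
from pathlib import Path


def bundle_learnware(source_dir: str, zip_path: str) -> list:
    """Zip every file under source_dir, storing paths relative to source_dir."""
    root = Path(source_dir)
    # Collect the files before creating the archive. Keep zip_path outside
    # source_dir so that the archive never ends up containing itself.
    files = sorted(path for path in root.rglob("*") if path.is_file())
    written = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in files:
            arcname = path.relative_to(root).as_posix()
            archive.write(path, arcname)
            written.append(arcname)
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        source = Path(workdir) / "my_learnware"
        source.mkdir()
        # Empty placeholder files stand in for the real learnware contents.
        for name in ("learnware.yaml", "__init__.py", "stat.json", "requirements.txt"):
            (source / name).write_text("")
        zip_path = str(Path(workdir) / "my_learnware.zip")
        print(bundle_learnware(str(source), zip_path))
```

The resulting ``zip_path`` is the value later passed to the upload call.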

Prepare Semantic Specification
------------------------------

The semantic specification succinctly describes the features of your task and model. Before uploading a learnware ``zip`` package, you need to prepare its semantic specification. Here is an example for a classification task on table data:

.. code-block:: python

from learnware.specification import generate_semantic_spec

# Prepare input description when data_type="Table"
input_description = {
"Dimension": 5,
"Description": {
"0": "age",
"1": "weight",
"2": "body length",
"3": "animal type",
"4": "claw length"
},
}

# Prepare output description when task_type in ["Classification", "Regression"]
output_description = {
"Dimension": 3,
"Description": {
"0": "cat",
"1": "dog",
"2": "bird",
},
}

# Create semantic specification
semantic_spec = generate_semantic_spec(
name="learnware_example",
description="Just an example for uploading learnware",
data_type="Table",
task_type="Classification",
library_type="Scikit-learn",
scenarios=["Business", "Financial"],
input_description=input_description,
output_description=output_description,
)
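
Since ``"Dimension"`` and the keys of ``"Description"`` are written by hand, they can drift apart. The sketch below checks them against each other before the specification is generated. It assumes, as in the example above, that every dimension is described with keys ``"0"`` to ``"Dimension - 1"``. This rule is inferred from the example and may be stricter than what ``Learnware`` itself requires, and ``description_is_consistent`` is a hypothetical helper.

```python
def description_is_consistent(description: dict) -> bool:
    """Check that "Description" has exactly the keys "0" .. str(Dimension - 1)."""
    dimension = description.get("Dimension")
    entries = description.get("Description", {})
    if not isinstance(dimension, int) or dimension < 0:
        return False
    expected_keys = {str(index) for index in range(dimension)}
    return set(entries) == expected_keys


if __name__ == "__main__":
    # Same structure as output_description in the example above.
    output_description = {
        "Dimension": 3,
        "Description": {"0": "cat", "1": "dog", "2": "bird"},
    }
    print(description_is_consistent(output_description))
```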

For more details, please refer to :ref:`semantic specification<components/spec:Semantic Specification>`.

Uploading
--------------

You can then upload your learnware to the ``Learnware Market`` as follows:

.. code-block:: python

