[DOC] update quick start

1 year ago · 2b3e646f36
--- a/docs/about/dev.rst
+++ b/docs/about/dev.rst
@@ -10,8 +10,12 @@ As a developer, you often want make changes to ``Learnware Market`` and hope it

 .. code-block:: bash
    
    $ git clone https://github.com/Learnware-LAMDA/Learnware.git && cd learnware
    $ python setup.py install
    $ git clone https://github.com/Learnware-LAMDA/Learnware.git && cd Learnware
    $ pip install -e .[dev]

 .. note::
   It's recommended to use anaconda/miniconda to set up the environment. You can also run ``pip install -e .[full,dev]`` to install ``torch`` automatically.


 Commit Format
 ==============
--- a/docs/components/spec.rst
+++ b/docs/components/spec.rst
@@ -37,6 +37,9 @@ Semantic Specification
 The semantic specification consists of a "dict" structure that includes keywords "Data", "Task", "Library", "Scenario", "License", "Description", and "Name". 
 In the case of table learnwares, users should additionally provide descriptions for each feature dimension and output dimension through the "Input" and "Output" keywords.

 - If "data_type" is "Table", you need to specify the semantics of each dimension of the model's input data to make the uploaded learnware suitable for tasks with heterogeneous feature spaces.
 - If "task_type" is "Classification", you need to provide the semantics of model output labels (prediction labels start from 0), making the uploaded learnware suitable for classification tasks with heterogeneous output spaces.
 - If "task_type" is "Regression", you need to specify the semantics of each dimension of the model output, making the uploaded learnware suitable for regression tasks with heterogeneous output spaces.
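As a rough illustration of the structure described above, the sketch below checks a plain ``dict`` for the keywords named in this section. The helper ``missing_keywords`` and the sample values are hypothetical and not part of the ``learnware`` package, and the ``{"Values": ..., "Type": ...}`` layout of each entry is an assumption made for this example.

```python
# Keywords listed in this section; "Input" is checked separately for table data.
REQUIRED_KEYWORDS = ("Data", "Task", "Library", "Scenario", "License", "Description", "Name")


def missing_keywords(semantic_spec):
    """Return the expected keywords that are absent from ``semantic_spec``.

    Hypothetical helper for illustration only; not a ``learnware`` API.
    """
    missing = [key for key in REQUIRED_KEYWORDS if key not in semantic_spec]
    # Table learnwares additionally describe each input dimension.
    data_values = semantic_spec.get("Data", {}).get("Values", [])
    if "Table" in data_values and "Input" not in semantic_spec:
        missing.append("Input")
    return missing


# An intentionally incomplete specification (assumed layout).
example_spec = {
    "Data": {"Values": ["Table"], "Type": "Class"},
    "Task": {"Values": ["Classification"], "Type": "Class"},
    "Name": {"Values": "demo_learnware", "Type": "String"},
}
print(missing_keywords(example_spec))
```

A check like this can be run before uploading to catch a forgotten keyword early; it does not replace the validation performed by the market itself.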

 Regular Specification
 ======================================
--- a/docs/start/install.rst
+++ b/docs/start/install.rst
@@ -16,6 +16,18 @@ Users can easily install ``Learnware`` by pip according to the following command

    pip install learnware

 In the ``Learnware`` package, besides the base classes, many core functionalities such as "learnware specification generation" and "learnware deployment" rely on the ``torch`` library. Users have the option to manually install ``torch``, or they can directly use the following command to install the ``learnware`` package:

 .. code-block:: bash

    pip install learnware[full]

 .. note:: 
    Because local environments vary, installing ``learnware[full]`` does not guarantee that ``torch`` can use ``CUDA`` on your machine.
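To see whether ``torch`` can actually reach ``CUDA`` after installation, a small check along the following lines may help. This is a minimal sketch: ``cuda_status`` is a hypothetical helper name, and the only PyTorch call it relies on is ``torch.cuda.is_available()``.

```python
import importlib.util


def cuda_status():
    """Report whether ``torch`` is importable and whether it can see a CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"
    try:
        import torch
    except Exception:
        # A broken installation can fail at import time (e.g. missing shared libraries).
        return "torch is installed but failed to import"
    if torch.cuda.is_available():
        return "torch can use CUDA"
    return "torch is installed, but CUDA is not available"


print(cuda_status())
```

If the last message is printed on a machine that has a GPU, installing a ``torch`` build that matches the local CUDA driver manually is the usual next step.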


 Install ``Learnware`` Package From Source
 ==========================================

 Users can also install ``Learnware`` from source by following these steps:

@@ -24,11 +36,11 @@ Also, Users can install ``Learnware`` by the source code according to the follow

    .. code-block:: bash
        
        $ git clone https://github.com/Learnware-LAMDA/Learnware.git && cd learnware
        $ python setup.py install
        $ git clone https://github.com/Learnware-LAMDA/Learnware.git && cd Learnware
        $ pip install -e .[dev]

 .. note::
   It's recommended to use anaconda/miniconda to set up the environment.
   It's recommended to use anaconda/miniconda to set up the environment. You can also run ``pip install -e .[full,dev]`` to install ``torch`` automatically.

 Use the following code to verify that the installation was successful:

--- a/docs/start/quick.rst
+++ b/docs/start/quick.rst
@@ -16,76 +16,30 @@ Installation

 Learnware is currently hosted on `PyPI <https://pypi.org/>`_. You can easily install ``Learnware`` by following these steps:

 - For Windows and Linux users:
 .. code-block:: bash

    .. code-block::
    pip install learnware

        pip install learnware
 In the ``Learnware`` package, besides the base classes, many core functionalities such as "learnware specification generation" and "learnware deployment" rely on the ``torch`` library. Users have the option to manually install ``torch``, or they can directly use the following command to install the ``learnware`` package:

 - For macOS users:
 .. code-block:: bash

    .. code-block::

        conda install -c pytorch faiss
        pip install learnware
    pip install learnware[full]

 .. note:: 
    Because local environments vary, installing ``learnware[full]`` does not guarantee that ``torch`` can use ``CUDA`` on your machine.

 Prepare Learnware
 ====================

 The Learnware Market encompasses a broad variety of learnwares. A valid learnware is a zipfile that
 includes the following four components:

 - ``__init__.py``

    A Python file that provides interfaces for fitting, predicting, and fine-tuning your model.

 - ``rkme.json``

    A JSON file that contains the statistical specification of your data. 

 - ``learnware.yaml``
    
    A configuration file that details your model's class name, the type of statistical specification (e.g. ``RKMETableSpecification`` for Reduced Kernel Mean Embedding), and 
    the file name of your statistical specification file.

 - ``environment.yaml`` or ``requirements.txt``

    - ``environment.yaml`` for conda:

        A Conda environment configuration file for running the model. If the model environment is incompatible, this file can be used for manual configuration. 
        Here's how you can generate this file:

        - Create env config for conda:

            - For Windows users:
            
            .. code-block::

                conda env export | findstr /v "^prefix: " > environment.yaml
            
            - For macOS and Linux users

            .. code-block::

                conda env export | grep -v "^prefix: " > environment.yaml
            
        - Recover env from config:

        .. code-block::

            conda env create -f environment.yaml
    
    - ``requirements.txt`` for pip:

        A plain text document that lists all packages necessary for executing the model. These dependencies can be installed using pip with the command:

        .. code-block::
        
            pip install -r requirements.txt
 In the ``Learnware`` package, each learnware is encapsulated in a ``zip`` package, which should contain at least the following four files:

 We've also detailed the format of the learnware zipfile in :ref:`Learnware Preparation<workflows/upload:Prepare Learnware>`.
 - ``learnware.yaml``: learnware configuration file.
 - ``__init__.py``: methods for using the model.
 - ``stat.json``: the statistical specification of the learnware. Its filename can be customized and recorded in learnware.yaml.
 - ``environment.yaml`` or ``requirements.txt``: specifies the environment for the model.

 To make building a learnware easier, we provide a `Learnware Template <https://www.bmwu.cloud/static/learnware-template.zip>`_ that you can use as a basis for your own learnware. The format of the learnware ``zip`` package is detailed in :ref:`Learnware Preparation<workflows/upload:Prepare Learnware>`.
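Before uploading, it can be useful to confirm that a ``zip`` package contains the four files listed above. The sketch below uses only the standard-library ``zipfile`` module. ``missing_learnware_files`` is a hypothetical helper, not a ``learnware`` API, and it assumes the files sit at the top level of the archive.

```python
import os
import tempfile
import zipfile


def missing_learnware_files(zip_path, stat_filename="stat.json"):
    """List the expected files that are absent from the top level of the zip.

    Hypothetical helper; ``stat_filename`` defaults to the name used above,
    but the real name is whatever ``learnware.yaml`` records.
    """
    with zipfile.ZipFile(zip_path) as archive:
        names = set(archive.namelist())
    required = ("learnware.yaml", "__init__.py", stat_filename)
    missing = [name for name in required if name not in names]
    # Either environment file satisfies the environment requirement.
    if not {"environment.yaml", "requirements.txt"} & names:
        missing.append("environment.yaml or requirements.txt")
    return missing


# Build a deliberately incomplete package in a temporary directory and check it.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_zip = os.path.join(tmp_dir, "demo_learnware.zip")
    with zipfile.ZipFile(demo_zip, "w") as archive:
        archive.writestr("learnware.yaml", "")
        archive.writestr("__init__.py", "")
        archive.writestr("requirements.txt", "")
    demo_missing = missing_learnware_files(demo_zip)

print(demo_missing)
```

Here the statistical specification file is reported as missing because the demo archive was built without it.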

 Learnware Package Workflow
 ============================
@@ -100,11 +54,10 @@ You can initialize a basic ``Learnware Market`` named "demo" using the code snip

 .. code-block:: python
    
    import learnware
    from learnware.market import EasyMarket
    from learnware.market import instantiate_learnware_market

    learnware.init()
    easy_market = EasyMarket(market_id="demo", rebuild=True)
    # instantiate a demo market
    demo_market = instantiate_learnware_market(market_id="demo", name="easy", rebuild=True) 


 Upload Learnware
@@ -114,28 +67,30 @@ Before uploading your learnware to the ``Learnware Market``,
 you'll need to create a semantic specification, ``semantic_spec``. This involves selecting or inputting values for predefined semantic tags 
 to describe the features of your task and model.

 For instance, the dictionary snippet below illustrates the semantic specification for a Scikit-Learn type model. 
 This model is tailored for business scenarios and performs classification tasks on tabular data:
 For instance, the following code illustrates the semantic specification for a Scikit-Learn type model. 
 This model is tailored for education scenarios and performs classification tasks on tabular data:

 .. code-block:: python

    semantic_spec = {
        "Data": {"Values": ["Tabular"], "Type": "Class"},
        "Task": {"Values": ["Classification"], "Type": "Class"},
        "Library": {"Values": ["Scikit-learn"], "Type": "Class"},
        "Scenario": {"Values": ["Business"], "Type": "Tag"},
        "Description": {"Values": "", "Type": "String"},
        "Name": {"Values": "demo_learnware", "Type": "String"},
    }
    from learnware.specification import generate_semantic_spec

    semantic_spec = generate_semantic_spec(
        name="demo_learnware",
        data_type="Table",
        task_type="Classification",
        library_type="Scikit-learn",
        scenarios="Education",
        license="MIT",
    )

 After defining the semantic specification, 
 you can upload your learnware using a single line of code:
    
 .. code-block:: python
    
    easy_market.add_learnware(zip_path, semantic_spec) 

 Here, ``zip_path`` is the path to your learnware zipfile.
    demo_market.add_learnware(zip_path, semantic_spec) 

 Here, ``zip_path`` is the path to your learnware ``zip`` package.


 Semantic Specification Search
@@ -150,10 +105,11 @@ The ``Learnware Market`` will then perform an initial search using ``user_semant
    user_info = BaseUserInfo(id="user", semantic_spec=semantic_spec)

    # search_learnware: performs semantic specification search when user_info doesn't include a statistical specification
    _, single_learnware_list, _ = easy_market.search_learnware(user_info) 
    search_results = easy_market.search_learnware(user_info) 
    single_result = search_results.get_single_results()

    # single_learnware_list: the learnware list returned by semantic specification search
    print(single_learnware_list)
    # single_result: the List of Tuple[Score, Learnware] returned by semantic specification search
    print(single_result)
    

 Statistical Specification Search
@@ -176,31 +132,35 @@ For example, the code below executes learnware search when using Reduced Set Ker
    user_info = BaseUserInfo(
        semantic_spec=user_semantic, stat_info={"RKMETableSpecification": user_spec}
    )
    (sorted_score_list, single_learnware_list,
        mixture_score, mixture_learnware_list) = easy_market.search_learnware(user_info)
    search_results = easy_market.search_learnware(user_info)

    # sorted_score_list: learnware scores(based on MMD distances), sorted in descending order
    print(sorted_score_list) 
    single_result = search_results.get_single_results()
    multiple_result = search_results.get_multiple_results()

    # single_learnware_list: learnwares, sorted by scores in descending order
    print(single_learnware_list)
    # search_item.score: based on MMD distances, sorted in descending order
    # search_item.learnware.id: id of learnwares, sorted by scores in descending order
    for search_item in single_result:
        print(f"score: {search_item.score}, learnware_id: {search_item.learnware.id}")

    # mixture_learnware_list: collection of learnwares whose combined use is beneficial
    print(mixture_learnware_list) 

    # mixture_score: score assigned to the combined set of learnwares in `mixture_learnware_list`
    print(mixture_score)
    # mixture_item.learnwares: collection of learnwares whose combined use is beneficial
    # mixture_item.score: score assigned to the combined set of learnwares in `mixture_item.learnwares`
    for mixture_item in multiple_result:
        print(f"mixture_score: {mixture_item.score}\n")
        mixture_id = " ".join([learnware.id for learnware in mixture_item.learnwares])
        print(f"mixture_learnware: {mixture_id}\n")
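The loop over ``single_result`` above only needs each item to expose ``score`` and ``learnware.id``. As a minimal sketch of how such a list could be narrowed down to the best candidates, the example below uses hypothetical stand-in classes (``FakeLearnware``, ``FakeSearchItem``) instead of the real search result objects, so it runs without a market.

```python
from dataclasses import dataclass


@dataclass
class FakeLearnware:
    """Stand-in exposing only the ``id`` attribute used above (hypothetical)."""
    id: str


@dataclass
class FakeSearchItem:
    """Stand-in exposing only ``score`` and ``learnware`` (hypothetical)."""
    score: float
    learnware: FakeLearnware


def top_k_ids(single_result, k=2):
    """Return the ids of the ``k`` highest-scoring items."""
    # Sort explicitly so the helper does not depend on the input order.
    ranked = sorted(single_result, key=lambda item: item.score, reverse=True)
    return [item.learnware.id for item in ranked[:k]]


demo_result = [
    FakeSearchItem(score=0.42, learnware=FakeLearnware(id="lw_b")),
    FakeSearchItem(score=0.91, learnware=FakeLearnware(id="lw_a")),
    FakeSearchItem(score=0.17, learnware=FakeLearnware(id="lw_c")),
]
print(top_k_ids(demo_result))
```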


 Reuse Learnwares
 -------------------------------

 With the list of learnwares, ``mixture_learnware_list``, returned from the previous step, you can readily apply them to make predictions on your own data, bypassing the need to train a model from scratch. 
 We offer two baseline methods for reusing a given list of learnwares: ``JobSelectorReuser`` and ``AveragingReuser``. 
 Just substitute ``test_x`` in the code snippet below with your own testing data, and you're all set to reuse learnwares!
 We provide two methods for reusing a given list of learnwares: ``JobSelectorReuser`` and ``AveragingReuser``. 
 Just substitute ``test_x`` in the code snippet below with your own testing data, and you're all set to reuse learnwares:

 .. code-block:: python

    from learnware.reuse import JobSelectorReuser, AveragingReuser

    # using jobselector reuser to reuse the searched learnwares to make prediction
    reuse_job_selector = JobSelectorReuser(learnware_list=mixture_learnware_list)
    job_selector_predict_y = reuse_job_selector.predict(user_data=test_x)
@@ -210,6 +170,25 @@ Just substitute ``test_x`` in the code snippet below with your own testing data,
    ensemble_predict_y = reuse_ensemble.predict(user_data=test_x)


 When labeled data is available, we also provide two methods for reusing a given list of learnwares: ``EnsemblePruningReuser`` and ``FeatureAugmentReuser``.
 Substitute ``test_x`` in the code snippet below with your own testing data and ``train_X, train_y`` with your own labeled training data, and you're all set to reuse learnwares:

 .. code-block:: python

    from learnware.reuse import EnsemblePruningReuser, FeatureAugmentReuser

    # Use ensemble pruning reuser to reuse the searched learnwares to make prediction
    reuse_ensemble = EnsemblePruningReuser(learnware_list=mixture_item.learnwares, mode="classification")
    reuse_ensemble.fit(train_X, train_y)
    ensemble_pruning_predict_y = reuse_ensemble.predict(user_data=test_x)

    # Use feature augment reuser to reuse the searched learnwares to make prediction
    reuse_feature_augment = FeatureAugmentReuser(learnware_list=mixture_item.learnwares, mode="classification")
    reuse_feature_augment.fit(train_X, train_y)
    feature_augment_predict_y = reuse_feature_augment.predict(user_data=test_x)



 Auto Workflow Example
 ============================

--- a/docs/workflows/upload.rst
+++ b/docs/workflows/upload.rst
@@ -7,7 +7,7 @@ In this section, we provide a comprehensive guide on submitting your custom lear
 We will first discuss the necessary components of a valid learnware, followed by a detailed explanation on how to upload and remove learnwares within ``Learnware Market``.


 Prepare Learnware ``Zip`` Package
 Prepare Learnware
 ====================================

 In the ``Learnware`` package, each learnware is encapsulated in a ``zip`` package, which should contain at least the following four files:
@@ -196,12 +196,59 @@ Please note that if you use the ``requirements.txt`` file to specify runtime dep

 Furthermore, for version-sensitive packages like ``torch``, it's essential to specify package versions in the ``requirements.txt`` file to ensure successful deployment of the uploaded learnware on other machines.

 Upload Learnware ``Zip`` Package
 Upload Learnware
 ==================================

 After preparing the four required files mentioned above, 
 you can bundle them into your own learnware ``zip`` package. Along with the generated semantic specification that 
 succinctly describes the features of your task and model (for more details, please refer to :ref:`semantic specification<components/spec:Semantic Specification>`), 
 After preparing the four required files mentioned above, you can bundle them into your own learnware ``zip`` package.

 Prepare Semantic Specification
 ------------------------------

 The semantic specification succinctly describes the features of your task and model. To upload a learnware ``zip`` package, you need to prepare its semantic specification. Here is an example for a classification task on table data:

 .. code-block:: python

    from learnware.specification import generate_semantic_spec

    # Prepare input description when data_type="Table"
    input_description = {
        "Dimension": 5,
        "Description": {
            "0": "age",
            "1": "weight",
            "2": "body length",
            "3": "animal type",
            "4": "claw length"
        },
    }

    # Prepare output description when task_type in ["Classification", "Regression"]
    output_description = {
        "Dimension": 3,
        "Description": {
            "0": "cat",
            "1": "dog",
            "2": "bird",
        },
    }

    # Create semantic specification
    semantic_spec = generate_semantic_spec(
        name="learnware_example",
        description="Just an example for uploading learnware",
        data_type="Table",
        task_type="Classification",
        library_type="Scikit-learn",
        scenarios=["Business", "Financial"],
        input_description=input_description,
        output_description=output_description,
    )

 For more details, please refer to :ref:`semantic specification<components/spec:Semantic Specification>`.
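In the example above, each description lists one entry per dimension, keyed by the dimension index as a string. The sketch below checks that convention. ``description_is_consistent`` is a hypothetical helper, and the rule that keys must be exactly ``"0"`` through ``"Dimension - 1"`` is an assumption inferred from the example rather than a documented requirement.

```python
def description_is_consistent(description):
    """Check that "Description" has one entry per dimension, keyed "0", "1", ...

    Hypothetical helper for illustration only; not a ``learnware`` API.
    """
    expected_keys = {str(index) for index in range(description["Dimension"])}
    return set(description["Description"]) == expected_keys


# Same shape as the output description in the example above.
output_description = {
    "Dimension": 3,
    "Description": {"0": "cat", "1": "dog", "2": "bird"},
}
# A description whose declared dimension does not match its entries.
broken_description = {
    "Dimension": 3,
    "Description": {"0": "cat", "1": "dog"},
}

print(description_is_consistent(output_description))
print(description_is_consistent(broken_description))
```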

 Uploading
 --------------

 After preparing the semantic specification, you can upload your learnware to the ``Learnware Market`` as follows:

 .. code-block:: python