.. _quick:

============================================================
Quick Start
============================================================


Introduction
====================

This ``Quick Start`` guide shows how to set up a complete ``Learnware Market`` workflow
and how to use the ``Learnware Market`` to handle user tasks.

Installation
====================

Learnware is currently hosted on `PyPI <https://pypi.org/>`__. You can easily install ``learnware`` by following these steps:

- For Windows and Linux users:

    .. code-block::

        pip install learnware

- For macOS users:

    .. code-block::

        conda install -c pytorch faiss
        pip install learnware

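As a quick sanity check after installation, the sketch below tests whether the current interpreter can locate the ``learnware`` package without importing it. The helper name ``is_installed`` is our own and is not part of ``learnware``.

.. code-block:: python

    import importlib.util


    def is_installed(package_name: str) -> bool:
        """Return True if the import system can locate a top-level package."""
        return importlib.util.find_spec(package_name) is not None


    # Prints True once `pip install learnware` has succeeded in this environment.
    print("learnware installed:", is_installed("learnware"))
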
Prepare Learnware
====================

The Learnware Market encompasses a broad variety of learnwares. A valid learnware is a zipfile that
includes the following four components:

- ``__init__.py``

    A Python file that provides interfaces for fitting, predicting, and fine-tuning your model.

- ``rkme.json``

    A JSON file that contains the statistical specification of your data.

- ``learnware.yaml``

    A configuration file that details your model's class name, the type of statistical specification (e.g. ``RKMETableSpecification`` for Reduced Kernel Mean Embedding), and
    the file name of your statistical specification file.

- ``environment.yaml`` or ``requirements.txt``

    - ``environment.yaml`` for conda:

        A Conda environment configuration file for running the model. If the model environment is incompatible, this file can be used for manual configuration.
        Here's how you can generate this file:

        - Create env config for conda:

            - For Windows users:

                .. code-block::

                    conda env export | findstr /v "^prefix: " > environment.yaml

            - For macOS and Linux users:

                .. code-block::

                    conda env export | grep -v "^prefix: " > environment.yaml

        - Recover env from config:

            .. code-block::

                conda env create -f environment.yaml

    - ``requirements.txt`` for pip:

        A plain text document that lists all packages necessary for executing the model. These dependencies can be installed using pip with the command:

        .. code-block::

            pip install -r requirements.txt

We've also detailed the format of the learnware zipfile in :ref:`Learnware Preparation<workflows/upload:Prepare Learnware>`.

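The packaging step can be sketched with the standard library alone. The example below writes placeholder versions of the four components, zips them, and reports any component that is missing. The helper names ``build_learnware_zip`` and ``missing_components``, the placeholder file contents, and the flat layout (all files at the root of the archive) are assumptions made for illustration; the real files come from your own model and data.

.. code-block:: python

    import os
    import tempfile
    import zipfile

    # Placeholder contents only; real files come from your own model and data.
    PLACEHOLDER_FILES = {
        "__init__.py": "# model interface goes here\n",
        "rkme.json": "{}\n",
        "learnware.yaml": "# model and specification settings go here\n",
        "requirements.txt": "# one dependency per line\n",
    }


    def build_learnware_zip(src_dir, zip_path):
        """Zip every file in src_dir (assumed flat layout); return archived names."""
        with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
            for name in sorted(os.listdir(src_dir)):
                zf.write(os.path.join(src_dir, name), arcname=name)
            return zf.namelist()


    def missing_components(names):
        """List the components from the section above that are absent from names."""
        required = ["__init__.py", "rkme.json", "learnware.yaml"]
        missing = [name for name in required if name not in names]
        if not ({"environment.yaml", "requirements.txt"} & set(names)):
            missing.append("environment.yaml or requirements.txt")
        return missing


    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src")
        os.makedirs(src)
        for file_name, text in PLACEHOLDER_FILES.items():
            with open(os.path.join(src, file_name), "w") as f:
                f.write(text)
        names = build_learnware_zip(src, os.path.join(tmp, "demo_learnware.zip"))
        print(names)
        print(missing_components(names))

Running a check like ``missing_components`` before uploading catches an incomplete archive early, although it says nothing about whether the file contents are valid.
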
Learnware Market Workflow
============================

Users can start a ``Learnware Market`` workflow according to the following steps:

Initialize a Learnware Market
-------------------------------

The ``EasyMarket`` class provides the core functions of a ``Learnware Market``.
You can initialize a basic ``Learnware Market`` named "demo" using the code snippet below:

.. code-block:: python

    import learnware
    from learnware.market import EasyMarket

    learnware.init()
    easy_market = EasyMarket(market_id="demo", rebuild=True)

Upload Learnware
-------------------------------

Before uploading your learnware to the ``Learnware Market``,
you'll need to create a semantic specification, ``semantic_spec``. This involves selecting or inputting values for predefined semantic tags
to describe the features of your task and model.

For instance, the dictionary snippet below illustrates the semantic specification for a Scikit-Learn type model.
This model is tailored for business scenarios and performs classification tasks on tabular data:

.. code-block:: python

    semantic_spec = {
        "Data": {"Values": ["Tabular"], "Type": "Class"},
        "Task": {"Values": ["Classification"], "Type": "Class"},
        "Library": {"Values": ["Scikit-learn"], "Type": "Class"},
        "Scenario": {"Values": ["Business"], "Type": "Tag"},
        "Description": {"Values": "", "Type": "String"},
        "Name": {"Values": "demo_learnware", "Type": "String"},
    }

After defining the semantic specification,
you can upload your learnware using a single line of code:

.. code-block:: python

    easy_market.add_learnware(zip_path, semantic_spec)

Here, ``zip_path`` is the path to your learnware zipfile.

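Because the semantic specification is a plain dictionary, its shape can be checked before uploading. The sketch below is a hypothetical helper, ``check_semantic_spec``, that is not part of ``learnware``. It assumes the six keys and the three ``Type`` values seen in the example above are the expected ones, and that ``String`` entries hold a string while the other types hold a list; the market itself may accept more or enforce different rules.

.. code-block:: python

    # Keys and Type values are taken from the example above (an assumption,
    # not an authoritative schema).
    EXPECTED_KEYS = ("Data", "Task", "Library", "Scenario", "Description", "Name")
    EXPECTED_TYPES = {"Class", "Tag", "String"}


    def check_semantic_spec(spec):
        """Return a list of human-readable problems; an empty list means OK."""
        problems = []
        for key in EXPECTED_KEYS:
            entry = spec.get(key)
            if not isinstance(entry, dict):
                problems.append(f"missing entry: {key}")
                continue
            if "Values" not in entry or "Type" not in entry:
                problems.append(f"{key}: needs 'Values' and 'Type'")
                continue
            if entry["Type"] not in EXPECTED_TYPES:
                problems.append(f"{key}: unexpected Type {entry['Type']!r}")
            elif entry["Type"] == "String" and not isinstance(entry["Values"], str):
                problems.append(f"{key}: 'Values' should be a string")
            elif entry["Type"] != "String" and not isinstance(entry["Values"], list):
                problems.append(f"{key}: 'Values' should be a list")
        return problems


    demo_spec = {
        "Data": {"Values": ["Tabular"], "Type": "Class"},
        "Task": {"Values": ["Classification"], "Type": "Class"},
        "Library": {"Values": ["Scikit-learn"], "Type": "Class"},
        "Scenario": {"Values": ["Business"], "Type": "Tag"},
        "Description": {"Values": "", "Type": "String"},
        "Name": {"Values": "demo_learnware", "Type": "String"},
    }
    print(check_semantic_spec(demo_spec))
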
Semantic Specification Search
-------------------------------

To find learnwares that align with your task's purpose, you'll need to provide a semantic specification, ``user_semantic``, that outlines your task's characteristics.
The ``Learnware Market`` will then perform an initial search using ``user_semantic``, identifying potentially useful learnwares with models that solve tasks similar to your requirements.

.. code-block:: python

    from learnware.market import BaseUserInfo

    # construct user_info which includes a semantic specification
    user_info = BaseUserInfo(id="user", semantic_spec=user_semantic)

    # search_learnware: performs semantic specification search when user_info doesn't include a statistical specification
    # it returns four values (see the next section); only the learnware list is needed here
    _, single_learnware_list, _, _ = easy_market.search_learnware(user_info)

    # single_learnware_list: the learnware list returned by semantic specification search
    print(single_learnware_list)

Statistical Specification Search
---------------------------------

If you choose to provide your own statistical specification file, ``stat.json``,
the ``Learnware Market`` can further refine the selection of learnwares from the previous step.
This second-stage search leverages statistical information to identify one or more learnwares that are most likely to be beneficial for your task.

For example, the code below executes learnware search when using Reduced Kernel Mean Embedding (RKME) as the statistical specification:

.. code-block:: python

    import os

    import learnware.specification as specification
    from learnware.market import BaseUserInfo

    user_spec = specification.RKMETableSpecification()

    # unzip_path: directory for unzipped learnware zipfile
    user_spec.load(os.path.join(unzip_path, "rkme.json"))
    user_info = BaseUserInfo(
        semantic_spec=user_semantic, stat_info={"RKMETableSpecification": user_spec}
    )
    (sorted_score_list, single_learnware_list,
        mixture_score, mixture_learnware_list) = easy_market.search_learnware(user_info)

    # sorted_score_list: learnware scores (based on MMD distances), sorted in descending order
    print(sorted_score_list)

    # single_learnware_list: learnwares, sorted by scores in descending order
    print(single_learnware_list)

    # mixture_learnware_list: collection of learnwares whose combined use is beneficial
    print(mixture_learnware_list)

    # mixture_score: score assigned to the combined set of learnwares in `mixture_learnware_list`
    print(mixture_score)

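The scores above are described as based on MMD (maximum mean discrepancy) distances. To give a feel for the quantity, the sketch below computes a biased estimate of the squared MMD between two small samples with a Gaussian kernel. It is a textbook illustration, not the market's scoring code: the market compares reduced RKME specifications rather than raw samples, and the kernel, the ``gamma`` value, and all function names here are assumptions.

.. code-block:: python

    import math


    def gaussian_kernel(x, y, gamma=0.1):
        """Gaussian (RBF) kernel between two equal-length vectors."""
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
        return math.exp(-gamma * sq_dist)


    def mean_kernel(xs, ys, gamma):
        """Average kernel value over all pairs drawn from xs and ys."""
        total = sum(gaussian_kernel(x, y, gamma) for x in xs for y in ys)
        return total / (len(xs) * len(ys))


    def mmd_squared(xs, ys, gamma=0.1):
        """Biased estimate of the squared MMD between samples xs and ys."""
        return (
            mean_kernel(xs, xs, gamma)
            + mean_kernel(ys, ys, gamma)
            - 2.0 * mean_kernel(xs, ys, gamma)
        )


    user_data = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    similar = [[0.1, 0.0], [0.9, 0.1], [0.0, 1.1]]
    distant = [[5.0, 5.0], [6.0, 5.0], [5.0, 6.0]]

    # A smaller value means the two samples look more alike under this kernel.
    print(mmd_squared(user_data, similar))
    print(mmd_squared(user_data, distant))

A sample drawn from a similar distribution yields a value close to zero, while a shifted sample yields a clearly larger one, which is the intuition behind ranking learnwares by how closely their specification matches the user's.
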
Reuse Learnwares
-------------------------------

With the list of learnwares, ``mixture_learnware_list``, returned from the previous step, you can apply them directly to make predictions on your own data, without training a model from scratch.
We offer two baseline methods for reusing a given list of learnwares: ``JobSelectorReuser`` and ``AveragingReuser``.
Substitute ``test_x`` in the code snippet below with your own testing data to reuse the learnwares:

.. code-block:: python

    # JobSelectorReuser and AveragingReuser must first be imported from the
    # learnware package (the import path depends on the installed version)

    # use the job selector reuser to make predictions with the searched learnwares
    reuse_job_selector = JobSelectorReuser(learnware_list=mixture_learnware_list)
    job_selector_predict_y = reuse_job_selector.predict(user_data=test_x)

    # use the averaging ensemble reuser to make predictions with the searched learnwares
    reuse_ensemble = AveragingReuser(learnware_list=mixture_learnware_list)
    ensemble_predict_y = reuse_ensemble.predict(user_data=test_x)

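The idea behind an averaging ensemble can be shown without the library. In the sketch below, plain Python callables stand in for learnwares and their per-sample outputs are averaged. This is not the implementation of ``AveragingReuser``; the function name ``average_predictions`` and the stand-in models are hypothetical.

.. code-block:: python

    from statistics import mean


    def average_predictions(models, inputs):
        """Average the per-sample outputs of several prediction callables."""
        # per_model[i][j] is the prediction of model i for input j
        per_model = [[model(x) for x in inputs] for model in models]
        # zip(*per_model) regroups the predictions by input sample
        return [mean(sample_preds) for sample_preds in zip(*per_model)]


    # Three stand-in "models" that disagree by a constant offset.
    models = [
        lambda x: 2.0 * x,
        lambda x: 2.0 * x + 1.0,
        lambda x: 2.0 * x - 1.0,
    ]
    print(average_predictions(models, [0.0, 1.0, 2.0]))

Averaging cancels errors that point in opposite directions across models, which is why it serves as a simple baseline when several learnwares are returned together.
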
Auto Workflow Example
============================

The ``Learnware Market`` also offers an automated workflow example.
This includes preparing learnwares, uploading and deleting learnwares from the market, and searching for learnwares using both semantic and statistical specifications.
To experience the basic workflow of the Learnware Market, users can run [workflow code link].